Page MenuHomePhabricator

[IR] `GetElementPtrInst`: per-index `inrange` support
Changes PlannedPublic

Authored by lebedev.ri on Dec 2 2021, 12:24 PM.

Details

Summary

This is a follow-up for D22793, and the [llvm-dev] [RFC] Adding range metadata to array subscripts.: https://groups.google.com/g/llvm-dev/c/T9o51zB1JHY

As per C 6.5.6p9 / http://eel.is/c++draft/expr.add#4, given

struct S {
    int a[3];
    int b[3];
    int c[3];
};

void bar(int*);

void foo(S* s) {
  bar(&s.b[1]);
}

even though the pointer the bar receives has 4 ints to the left of it and 4 to the right of it,
the only ints it can access are one to the left and one to the right.
I.e. it can not go outside of the S::b.

But, there is currently no way to represent this in LLVM IR,
and this makes some optimizations impossible.
In my case, i'd like to improve alloca splitting,
but this also affects alias analysis.

While that RFC proposed a number of implementation variants,
i think it is obvious that the inrange approach is superiour,
because constant expr GEP's already support that,
and because encoding such knowledge via assumes will make it impossible to make use of it.

Now, GEP's can have a large number of indices,
so even though we have (7-1)+(16-1)=21 vacant bits total in SubclassOptionalData and SubclassData,
we have to support storing the arbitary number of is index inrange bits,
so we have to allocate additional trailing objects.
(I suspect there are bugs in this logic, will add better tests.)

I will add clang support in later changes.

Diff Detail

Unit TestsFailed

TimeTest
170 msx64 debian > Clang.OpenMP::reduction_implicit_map.cpp
Script: -- : 'RUN: at line 4'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -cc1 -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/build/lib/clang/14.0.0/include -nostdsysteminc -verify -fopenmp -fopenmp-cuda-mode -x c++ -triple powerpc64le-unknown-unknown -DCUDA -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc /var/lib/buildkite-agent/builds/llvm-project/clang/test/OpenMP/reduction_implicit_map.cpp -o /var/lib/buildkite-agent/builds/llvm-project/build/tools/clang/test/OpenMP/Output/reduction_implicit_map.cpp.tmp-ppc-host.bc

Event Timeline

lebedev.ri created this revision.Dec 2 2021, 12:24 PM
lebedev.ri requested review of this revision.Dec 2 2021, 12:24 PM
nikic requested changes to this revision.Dec 2 2021, 12:58 PM

While I support the general goal of exposing GEP offset restrictions to IR, I am quite strongly opposed to the implementation approach of extending inrange. The core issue is that this is strongly tied to LLVM struct types and structural GEP indexing. This will be a blow to opaque pointer usefulness and future offset canonicalization for GEPs.

I think the correct approach to inrange-like information is to restrict the range of GEP indices without relying on the underlying structure type. Think inrange(0, 4) i32 %x for %x between 0 and 4. This naturally integrates in purely offset-based alias analysis, and can be more generally preserved under transformation. For example, if you have gep %base, inrange (%x + 1), if this is transformed into gep (gep %base, 1), %x there is no way to preserve inrange information under the current proposal, while a proper offset-based approach could easily retain information under simple transformations.

This revision now requires changes to proceed.Dec 2 2021, 12:58 PM
lebedev.ri requested review of this revision.Dec 2 2021, 1:17 PM

While I support the general goal of exposing GEP offset restrictions to IR,
I am quite strongly opposed to the implementation approach of extending inrange.
The core issue is that this is strongly tied to LLVM struct types and structural GEP indexing.
This will be a blow to opaque pointer usefulness and future offset canonicalization for GEPs.

While i'm certainly sympathetic to the opaque pointer future,
i'd also like to remind that they are just a tool.

Concretely, can you quote anything that says that in the opaque pointer future,
the only GEP that will remain will only be able to apply a byte offset to the pointer,
i.e. there won't be GEP's into structs/multiple indices?

I think the correct approach to inrange-like information is to restrict the range of GEP indices
without relying on the underlying structure type. Think inrange(0, 4) i32 %x for %x between 0 and 4.
This naturally integrates in purely offset-based alias analysis, and can be more generally preserved under transformation.
For example, if you have gep %base, inrange (%x + 1), if this is transformed into gep (gep %base, 1), %x
there is no way to preserve inrange information under the current proposal,
while a proper offset-based approach could easily retain information under simple transformations.

I think you are missing the whole point there. It is explicitly NOT the point of this patch
to be able encode that some index must take values in range of [x, y). So if that is your proposal,
while it may be interesting, it's explicitly inferior, and does not solve the motivational case.

The reason being, it encodes *VERY* different semantics.
If we've encoded that in

struct S {
    int a[3];
    int b[3];
    int c[3];
};

void bar(int*);

void foo(S* s, int i) {
  int* p = &s.b[i];
  bar(p);
  int* p2 = p + 4; // UB!
  bar(p);
}

the variable i of function foo must be [0, 3),
that only tells us that p is pointing somewhere in (int*)s + 3 + [0, 3).
What it does not encode is that, given that pointer, we can not go outside of that array,
i.e. that auto* p2 = p + 4; is UB.

nikic added a comment.Dec 2 2021, 2:18 PM

I think you are missing the whole point there. It is explicitly NOT the point of this patch
to be able encode that some index must take values in range of [x, y). So if that is your proposal,
while it may be interesting, it's explicitly inferior, and does not solve the motivational case.

Right, sorry. I thought you wanted to solve the same issue discussed in the llvm-dev thread, but I see that you're more interested in SROA than AA here. I don't think this materially changes my point though. Rather than restricting the range of an offset, you are instead restricting which offsets of the GEP base can be accessed (either through this GEP, or a later one). That's still something that can be expressed in terms of explicit offsets rather than basing it on struct types. One could argue that this is really a "restrict object" operation that is largely independent from the GEP arithematic and could e.g. be represented as an intrinsic returning a restricted pointer. But I understand that encoding this in the GEP makes this more viable.

Concretely, can you quote anything that says that in the opaque pointer future,
the only GEP that will remain will only be able to apply a byte offset to the pointer,
i.e. there won't be GEP's into structs/multiple indices?

See this thread: https://groups.google.com/g/llvm-dev/c/U7D6z7ZnKy8/m/2-xy5zPcBAAJ Under opaque pointers, we are currently representing constant offset GEPs as i8 GEPs, because this makes things a lot simpler -- with opaque pointers type information tends to not be preserved naturally, and you have to go out of your way, and in some cases use heuristics, to preserve type information. And apart from inrange, doing so is entirely wasted effort. For example, SROA will just generate i8 GEPs instead of doing a complicated dance trying to find a minimal natural-looking GEP.

I think you are missing the whole point there. It is explicitly NOT the point of this patch
to be able encode that some index must take values in range of [x, y). So if that is your proposal,
while it may be interesting, it's explicitly inferior, and does not solve the motivational case.

Right, sorry. I thought you wanted to solve the same issue discussed in the llvm-dev thread, but I see that you're more interested in SROA than AA here.

Yup.

I don't think this materially changes my point though. Rather than restricting the range of an offset,
you are instead restricting which offsets of the GEP base can be accessed (either through this GEP,
or a later one). That's still something that can be expressed in terms of explicit offsets rather than
basing it on struct types.

I agree that we can express a number of cases with explicit offsets.
Presumably, we'd even be fine with having a number of these inrange(C0, C1) in a single GEP.
But, what happens for e.g.

struct S {
  int array[4][4];
};
int& foo(S*s, int i, int j) {
  return s.array[i][j];
}

?
We can annotate that we don't escape from the entire array (i.e. [0, 128)),
but if i is not a constant, how are we going to annotate the inner GEP?
Should we require that there be two GEP's there, and forbid their fusion?

One could argue that this is really a "restrict object" operation
that is largely independent from the GEP arithematic and could e.g. be represented as
an intrinsic returning a restricted pointer. But I understand that encoding this in the GEP makes this more viable.

Concretely, can you quote anything that says that in the opaque pointer future,
the only GEP that will remain will only be able to apply a byte offset to the pointer,
i.e. there won't be GEP's into structs/multiple indices?

See this thread: https://groups.google.com/g/llvm-dev/c/U7D6z7ZnKy8/m/2-xy5zPcBAAJ Under opaque pointers,
we are currently representing constant offset GEPs as i8 GEPs, because this makes things
a lot simpler -- with opaque pointers type information tends to not be preserved naturally,
and you have to go out of your way, and in some cases use heuristics, to preserve type information.
And apart from inrange, doing so is entirely wasted effort.
For example, SROA will just generate i8 GEPs instead of doing a complicated dance trying to find a minimal natural-looking GEP.

Note that i'm mostly interested in annotating the outermost variable GEP,
since that is what will limit the alloca splitting. Constant indices are obvious,
and any indices after the first variable indice won't be relevant.

lebedev.ri planned changes to this revision.Dec 3 2021, 8:53 AM

Ok, i guess i have a rather more general and radical, invasive solution, let's see if that works out first.

Consider the following:

struct S {
    int a[3], b[3], c[3];
};

int foo(S* s) {
  return &s.a[3] == &s.b[0];
}

The C standard says this returns 1; if you lower the address computation using inrange, it returns poison. I think? Maybe you can apply inrange markings in some cases, e.g. you can be more aggressive if you prove the address is used by a load/store operations. But it's not trivial.

Also, it's not obvious inrange is sound; nobody has tried to formally model the pointer provenance or pointer comparison rules specified for inrange.

nlopes added a comment.Dec 5 2021, 7:41 AM

(thinking out loud)
The main issue seems to be that C's memory model allows field-sensitive AA, while LLVM does not. But each field is not an object on its own, it's more like a sub-object. In C, you can memcpy a whole struct, as it's an object, but once you index into a specific filed you can't dereference another field.
It's as if the struct is an object and each field is a sub-object.

Re inrange, currently it doesn't disallow OOB pointers; it just says that dereferencing such pointers triggers UB. Eli's example is actually fine with the current wording. It's just that the current wording is not consistent with the rest of LangRef; it requires additional machinery in the memory model.
Other 2 possibilities that come to mind are TBAA and the proposal to handle C's restrict that pushes provenance information to memory operations rather than having it in GEPs. Though it's unclear to me whether tagging memory operations is even possible.

In the past, Johannes proposed using range information for the variables in GEP (using the assumptions machinery already available). That could work, but we need to make sure it doesn't block optimizations. We don't LLVM to trigger UB if an array index is OOB; we just need it to turn poison. The current ranges trigger UB AFAIR.

IMHO, we need a group of people to commit to studying the different features that are needed and make a consistent design to avoid the current patchwork we have. Not an easy feat, though.

In the past, Johannes proposed using range information for the variables in GEP (using the assumptions machinery already available). That could work, but we need to make sure it doesn't block optimizations. We don't LLVM to trigger UB if an array index is OOB; we just need it to turn poison.

Note that since D109746, BasicAA already uses ranges information to derive NoAlias. This is an example where the range information can help independently of proving UB. For now this only works when the compiler can otherwise prove the range, e.g. for

struct Histogram {
  int buckets[4];
  int count;

  void f(unsigned i) {
    ++buckets[i & 3];
    ++count;
    // buckets[i&3] and count are noalias.
  }
};

When clang starts emitting range information (in any form recognizable by ValueTracking, I don't have a strong opinion on which for is best), we can start having buckets[i & 3] and count be NoAlias more generally.