diff --git a/llvm/docs/GetElementPtr.rst b/llvm/docs/GetElementPtr.rst --- a/llvm/docs/GetElementPtr.rst +++ b/llvm/docs/GetElementPtr.rst @@ -88,15 +88,15 @@ .. code-block:: llvm - define void @munge(%struct.munger_struct* %P) { + define void @munge(ptr %P) { entry: - %tmp = getelementptr %struct.munger_struct, %struct.munger_struct* %P, i32 1, i32 0 - %tmp1 = load i32, i32* %tmp - %tmp2 = getelementptr %struct.munger_struct, %struct.munger_struct* %P, i32 2, i32 1 - %tmp3 = load i32, i32* %tmp2 + %tmp = getelementptr %struct.munger_struct, ptr %P, i32 1, i32 0 + %tmp1 = load i32, ptr %tmp + %tmp2 = getelementptr %struct.munger_struct, ptr %P, i32 2, i32 1 + %tmp3 = load i32, ptr %tmp2 %tmp4 = add i32 %tmp3, %tmp1 - %tmp5 = getelementptr %struct.munger_struct, %struct.munger_struct* %P, i32 0, i32 0 - store i32 %tmp4, i32* %tmp5 + %tmp5 = getelementptr %struct.munger_struct, ptr %P, i32 0, i32 0 + store i32 %tmp4, ptr %tmp5 ret void } @@ -108,11 +108,11 @@ .. code-block:: text - %MyVar = uninitialized global i32 + @MyVar = external global i32 ... - %idx1 = getelementptr i32, i32* %MyVar, i64 0 - %idx2 = getelementptr i32, i32* %MyVar, i64 1 - %idx3 = getelementptr i32, i32* %MyVar, i64 2 + %idx1 = getelementptr i32, ptr @MyVar, i64 0 + %idx2 = getelementptr i32, ptr @MyVar, i64 1 + %idx3 = getelementptr i32, ptr @MyVar, i64 2 These GEP instructions are simply making address computations from the base address of ``MyVar``. They compute, as follows (using C syntax): @@ -125,12 +125,12 @@ Since the type ``i32`` is known to be four bytes long, the indices 0, 1 and 2 translate into memory offsets of 0, 4, and 8, respectively. No memory is -accessed to make these computations because the address of ``%MyVar`` is passed +accessed to make these computations because the address of ``@MyVar`` is passed directly to the GEP instructions. The obtuse part of this example is in the cases of ``%idx2`` and ``%idx3``. They result in the computation of addresses that point to memory past the end of the -``%MyVar`` global, which is only one ``i32`` long, not three ``i32``\s long. +``@MyVar`` global, which is only one ``i32`` long, not three ``i32``\s long. While this is legal in LLVM, it is inadvisable because any load or store with the pointer that results from these GEP instructions would produce undefined results. @@ -145,22 +145,21 @@ .. code-block:: text - %MyStruct = uninitialized global { float*, i32 } + %MyStruct = external global { ptr, i32 } ... - %idx = getelementptr { float*, i32 }, { float*, i32 }* %MyStruct, i64 0, i32 1 + %idx = getelementptr { ptr, i32 }, ptr %MyStruct, i64 0, i32 1 -The GEP above yields an ``i32*`` by indexing the ``i32`` typed field of the +The GEP above yields a ``ptr`` by indexing the ``i32`` typed field of the structure ``%MyStruct``. When people first look at it, they wonder why the ``i64 0`` index is needed. However, a closer inspection of how globals and GEPs work reveals the need. Becoming aware of the following facts will dispel the confusion: -#. The type of ``%MyStruct`` is *not* ``{ float*, i32 }`` but rather ``{ float*, - i32 }*``. That is, ``%MyStruct`` is a pointer to a structure containing a - pointer to a ``float`` and an ``i32``. +#. The type of ``%MyStruct`` is *not* ``{ ptr, i32 }`` but rather ``ptr``. + That is, ``%MyStruct`` is a pointer (to a structure), not a structure itself. #. Point #1 is evidenced by noticing the type of the second operand of the GEP - instruction (``%MyStruct``) which is ``{ float*, i32 }*``. + instruction (``%MyStruct``) which is ``ptr``. #. The first index, ``i64 0`` is required to step over the global variable ``%MyStruct``. Since the second argument to the GEP instruction must always @@ -181,39 +180,31 @@ .. code-block:: text - %MyVar = uninitialized global { [40 x i32 ]* } + @MyVar = external global { i32, ptr } ... - %idx = getelementptr { [40 x i32]* }, { [40 x i32]* }* %MyVar, i64 0, i32 0, i64 0, i64 17 + %idx = getelementptr { i32, ptr }, ptr @MyVar, i64 0, i32 1 + %arr = load ptr, ptr %idx + %idx = getelementptr [40 x i32], ptr %arr, i64 0, i64 17 -In this example, we have a global variable, ``%MyVar`` that is a pointer to a -structure containing a pointer to an array of 40 ints. The GEP instruction seems -to be accessing the 18th integer of the structure's array of ints. However, this -is actually an illegal GEP instruction. It won't compile. The reason is that the -pointer in the structure *must* be dereferenced in order to index into the -array of 40 ints. Since the GEP instruction never accesses memory, it is -illegal. +In this example, we have a global variable, ``@MyVar``, which is a pointer to +a structure containing a pointer. Let's assume that this inner pointer points +to an array of type ``[40 x i32]``. The above IR will first compute the address +of the inner pointer, then load the pointer, and then compute the address of +the 18th array element. -In order to access the 18th integer in the array, you would need to do the -following: +This cannot be expressed in a single GEP instruction, because it requires +a memory dereference in between. However, the following example would work +fine: .. code-block:: text - %idx = getelementptr { [40 x i32]* }, { [40 x i32]* }* %, i64 0, i32 0 - %arr = load [40 x i32]*, [40 x i32]** %idx - %idx = getelementptr [40 x i32], [40 x i32]* %arr, i64 0, i64 17 - -In this case, we have to load the pointer in the structure with a load -instruction before we can index into the array. If the example was changed to: - -.. code-block:: text - - %MyVar = uninitialized global { [40 x i32 ] } + @MyVar = external global { i32, [40 x i32 ] } ... - %idx = getelementptr { [40 x i32] }, { [40 x i32] }*, i64 0, i32 0, i64 17 + %idx = getelementptr { [40 x i32] }, ptr @MyVal, i64 0, i32 1, i64 17 -then everything works fine. In this case, the structure does not contain a -pointer and the GEP instruction can index through the global variable, into the -first field of the structure and access the 18th ``i32`` in the array there. +In this case, the structure does not contain a pointer and the GEP instruction +can index through the global variable, into the second field of the structure +and access the 18th ``i32`` in the array there. Why don't GEP x,0,0,1 and GEP x,1 alias? ---------------------------------------- @@ -226,17 +217,15 @@ .. code-block:: llvm - %MyVar = global { [10 x i32] } - %idx1 = getelementptr { [10 x i32] }, { [10 x i32] }* %MyVar, i64 0, i32 0, i64 1 - %idx2 = getelementptr { [10 x i32] }, { [10 x i32] }* %MyVar, i64 1 + @MyVar = external global { [10 x i32] } + %idx1 = getelementptr { [10 x i32] }, ptr @MyVar, i64 0, i32 0, i64 1 + %idx2 = getelementptr { [10 x i32] }, ptr @MyVar, i64 1 In this example, ``idx1`` computes the address of the second integer in the -array that is in the structure in ``%MyVar``, that is ``MyVar+4``. The type of -``idx1`` is ``i32*``. However, ``idx2`` computes the address of *the next* -structure after ``%MyVar``. The type of ``idx2`` is ``{ [10 x i32] }*`` and its -value is equivalent to ``MyVar + 40`` because it indexes past the ten 4-byte -integers in ``MyVar``. Obviously, in such a situation, the pointers don't -alias. +array that is in the structure in ``@MyVar``, that is ``MyVar+4``. However, +``idx2`` computes the address of *the next* structure after ``@MyVar``, that is +``MyVar+40``, because it indexes past the ten 4-byte integers in ``MyVar``. +Obviously, in such a situation, the pointers don't alias. Why do GEP x,1,0,0 and GEP x,1 alias? ------------------------------------- @@ -244,18 +233,16 @@ Quick Answer: They compute the same address location. These two GEP instructions will compute the same address because indexing -through the 0th element does not change the address. However, it does change the -type. Consider this example: +through the 0th element does not change the address. Consider this example: .. code-block:: llvm - %MyVar = global { [10 x i32] } - %idx1 = getelementptr { [10 x i32] }, { [10 x i32] }* %MyVar, i64 1, i32 0, i64 0 - %idx2 = getelementptr { [10 x i32] }, { [10 x i32] }* %MyVar, i64 1 + @MyVar = global { [10 x i32] } + %idx1 = getelementptr { [10 x i32] }, ptr @MyVar, i64 1, i32 0, i64 0 + %idx2 = getelementptr { [10 x i32] }, ptr @MyVar, i64 1 -In this example, the value of ``%idx1`` is ``%MyVar+40`` and its type is -``i32*``. The value of ``%idx2`` is also ``MyVar+40`` but its type is ``{ [10 x -i32] }*``. +In this example, the value of ``%idx1`` is ``MyVar+40``, and the value of +``%idx2`` is also ``MyVar+40``. Can GEP index into vector elements? ----------------------------------- @@ -447,8 +434,8 @@ As such, there are some ramifications of this for inbounds GEPs: scales implied by array/vector/pointer indices are always known to be "nsw" since they are signed values that are scaled by the element size. These values are also -allowed to be negative (e.g. "``gep i32 *%P, i32 -1``") but the pointer itself -is logically treated as an unsigned value. This means that GEPs have an +allowed to be negative (e.g. "``gep i32, ptr %P, i32 -1``") but the pointer +itself is logically treated as an unsigned value. This means that GEPs have an asymmetric relation between the pointer base (which is treated as unsigned) and the offset applied to it (which is treated as signed). The result of the additions within the offset calculation cannot have signed overflow, but when