diff --git a/llvm/docs/OpaquePointers.rst b/llvm/docs/OpaquePointers.rst
new file mode 100644
--- /dev/null
+++ b/llvm/docs/OpaquePointers.rst
@@ -0,0 +1,101 @@
+===============
+Opaque Pointers
+===============
+
+The Opaque Pointer Type
+=======================
+
+Traditionally, LLVM IR pointer types contain a pointee type. For example,
+``i32*`` is a pointer that points to an ``i32`` somewhere in memory. And in the
+past, some instructions would look at the pointee type to determine what type to
+work with. For example, a load would look at the pointer operand's pointee type
+to determine what type to load from memory. However, ultimately LLVM memory is
+not typed, and there are issues with having explicit pointee types.
+
+LLVM IR pointers can be cast back and forth between pointers with different
+pointee types. The pointee type does not necessarily represent the actual
+underlying type in memory. In other words, the pointee type carries no real
+semantics.
+
+Lots of operations do not actually care about the underlying type. These
+operations, typically intrinsics, end up taking an ``i8*``. This causes lots of
+redundant bitcasts in the IR. The extra bitcasts take up space and are extra
+work to look through in optimizations. And more bitcasts increase the chances of
+incorrect bitcasts, especially with regard to address spaces.
+
+Since LLVM IR works on untyped memory, for a frontend to tell LLVM about
+frontend types for the purposes of alias analysis, extra metadata is added to
+the IR. For more information, see `TBAA `_.
+
+The frontend should already know what type each operation operates on based on
+the input source code. However, frontends like Clang may end up relying on LLVM
+pointer pointee types to keep track of pointer types. The frontend needs to make
+sure to keep track of pointee types on its own.
+
+Some instructions still need to know what type to treat the memory pointed to by
+the pointer as.
+For example, a load needs to know how many bytes to load from
+memory. In these cases, instructions themselves contain a type argument. For
+example, the load instruction from older versions of LLVM
+
+.. code-block:: llvm
+
+  load i64* %p
+
+becomes
+
+.. code-block:: llvm
+
+  load i64, ptr %p
+
+This is fairly similar to how there is no distinction between signed and
+unsigned integer types; rather, the integer operations themselves specify how to
+treat the integer. Initially, LLVM IR distinguished between unsigned and
+signed integer types. The transition from manifesting signedness in types to
+instructions happened early on in LLVM's life to the betterment of LLVM IR.
+
+I Still Need Pointee Types!
+===========================
+
+Some places still need to know what type a pointer points to. For the most part,
+this is codegen and ABI specific. For example, `byval
+`_ arguments are pointers, but backends need
+to know the underlying type of the argument to properly lower it. In cases like
+these, the attributes contain a type argument. For example,
+
+.. code-block:: llvm
+
+  call void @f(ptr byval(i32) %p)
+
+signifies that ``%p`` as an argument should be lowered as an ``i32`` passed
+indirectly.
+
+If you have use cases that this sort of fix doesn't cover, please email
+llvm-dev.
+
+Transition Plan
+===============
+
+Making this change in one huge commit is infeasible. This needs to be done
+incrementally.
+The following steps need to be done, in no particular order:
+
+* Introduce the opaque pointer type
+
+* Various ABI attributes and instructions that need a type can be changed one at
+  a time
+
+  * This has already happened for many instructions like loads, stores, GEPs,
+    and various attributes like ``byval``
+
+* Fix up existing in-tree users of pointee types to not rely on LLVM pointer
+  pointee types
+
+* Allow bitcode auto-upgrade of legacy pointer type to the new opaque pointer
+  type (not to be turned on until ready)
+
+* Migrate frontends to not keep track of frontend pointee types via LLVM pointer
+  pointee types
+
+* Add option to internally treat all pointer types as opaque pointers and see
+  what breaks, starting with LLVM tests, then run Clang over large codebases
+
+* Replace legacy pointer types in LLVM tests with opaque pointer types
diff --git a/llvm/docs/UserGuides.rst b/llvm/docs/UserGuides.rst
--- a/llvm/docs/UserGuides.rst
+++ b/llvm/docs/UserGuides.rst
@@ -44,6 +44,7 @@
    MergeFunctions
    MCJITDesignAndImplementation
    ORCv2
+   OpaquePointers
    JITLink
    NewPassManager
    NVPTXUsage