Index: docs/ScalableVectorType.rst =================================================================== --- /dev/null +++ docs/ScalableVectorType.rst @@ -0,0 +1,138 @@ +============================================================= +Extending VectorType to support scalable vector architectures +============================================================= + +To represent a vector of unknown length a boolean `Scalable` property has been +added to the `VectorType` class, which indicates that the number of elements in +the vector is a runtime-determined integer multiple of the `NumElements` field. +Most code that deals with vectors doesn't need to know the exact length, but +does need to know relative lengths -- e.g. get a vector with the same number of +elements but a different element type, or with half or double the number of +elements. + +In order to allow code to transparently support scalable vectors, we introduce +an `ElementCount` class with two members: + +- `unsigned Min`: the minimum number of elements. +- `bool Scalable`: is the element count an unknown multiple of `Min`? + +For non-scalable vectors (``Scalable=false``) the scale is considered to be +equal to one and thus `Min` represents the exact number of elements in the +vector. + +The intent for code working with vectors is to use convenience methods and avoid +directly dealing with the number of elements. If needed, calling +`getElementCount` on a vector type instead of `getVectorNumElements` can be used +to obtain the (potentially scalable) number of elements. Overloaded division and +multiplication operators allow an ElementCount instance to be used in much the +same manner as an integer for most cases. + +This mixture of compile-time and runtime quantities allow us to reason about the +relationship between different scalable vector types without knowing their +exact length. + +The runtime multiple is not expected to change during program execution for SVE +or the SX Aurora architecture; execution behaviour is undefined for SVE in that +case, as the upper register contents (past 128b) are not architecturally defined +after a change of vector length in privileged code. + +For the RISC-V V extension, it may change on a per-function basis, which +precludes interprocedural optimizations. I leave working out the details of any +future extensions to scalable vector types in this manner to those working on +RVV. + +Allowed Operations +------------------ + +All operations allowed on first class types will work on scalable vector types, +including loads, stores, and PHI operations. + +Scalable vector Values can be spilled to/filled from the stack, with the details +on stack frame layout left to the target. + +Restrictions +------------ + +Global variables cannot be scalable vector types, because they need to be a +fixed size in the appropriate section of a resulting binary. + +Scalable vector types cannot be members of StructType or ArrayType aggregates +because they are defined to be fixed size; while they could be extended as +well, that isn't required to support scalable vectors and would need more +code changes. + +Supporting scalable vectors in C structs (as in Arm's proposed ACLE for SVE) +will still be possible by having clang lower the struct to a plain pointer +with offsets scaled by the runtime multiple. + +Legalization +------------ + +To legalize a scalable vector IR type to SelectionDAG types, the same procedure +is used as for fixed-length vectors, with one minor difference: + + - If the target does not support scalable vectors, the runtime multiple is + assumed to be a constant '1' and the scalable flag is dropped. Legalization + proceeds as normal after this. + +IR Textual Form +--------------- + +The textual form for a scalable vector is: + +`` x >`` + +where `type` is the scalar type of each element, `n` is the minimum number of +elements, and the string literal `scalable` indicates that the total number of +elements is an unknown multiple of `n`; `scalable` is just an arbitrary choice +for indicating that the vector is scalable, and could be substituted by another. +For fixed-length vectors, the `scalable` is omitted, so there is no change in +the format for existing vectors. + +Scalable vectors with the same `Min` value have the same number of elements, and +the same number of bytes if `Min * sizeof(type)` is the same (assuming they are +used within the same function): + +```` and ```` have the same number of + elements. + +```` and ```` have the same number of + bytes. + +IR Bitcode Form +--------------- + +To serialize scalable vectors to bitcode, a new boolean field is added to the +type record if the vector is scalable, and is omitted if fixed-length. This +preserves backwards compatibility with existing bitcode. + +Size Queries +------------ + +This is a proposal for how to deal with querying the size of scalable types for +analysis of IR. + +For current IR types that have a known size, all query functions return a single +integer constant. For scalable types a second integer is needed to indicate the +number of bytes/bits which need to be scaled by the runtime multiple to obtain +the actual length. + +For primitive types, `getPrimitiveSizeInBits()` will function as it does today, +except that if the type is a VectorType marked as Scalable, it will assert. + +A new function `getScalableSizeInBits()` will be added, which returns a struct +containing an integer to represent the minimum size and a boolean flag to +indicate that the minimum size is scaled by the runtime multiple. For backends +that do not need to deal with scalable types the existing methods will suffice, +but an assert will be added to them to ensure they aren't used on scalable +types. This will reduce the number of code changes required. + +Similar functionality will be added to DataLayout, and the struct definition +will be shared between them. + +Comparisons between unscaled-only or scaled-only sizes will work as expected, +and convenience operators will be provided. For now, if unscaled sizes are +compared against scaled sizes, the comparison operator will assert. This +restriction may be relaxed in future if a valid use case is found. + +Comparisons between scaled sizes with different runtime multiples is invalid.