Index: clang/docs/BitIntABI.rst
===================================================================
--- /dev/null
+++ clang/docs/BitIntABI.rst
@@ -0,0 +1,176 @@
+===============================
+_BitInt Clang ABI Specification
+===============================
+
+.. contents::
+   :local:
+
+History
+=======
+
+* **version 1** 2021/10/11
+
+  * Initial document
+
+This document provides a generic ABI specification for the C23 ``_BitInt``
+feature, expressed in terms of concepts common to low-level platform ABIs. This
+document does not directly constrain any platform, as the ABI used for
+``_BitInt`` on a platform is determined by that platform's ABI specification.
+It is expected that platform ABIs will codify the ABI for the ``_BitInt``
+feature on their platform using this specification, either by reference or by
+copying the text of this specification into the platform ABI specification.
+
+Platform ABIs that include this specification by reference should explicitly
+give a version and the parameters listed below. For example: "The ABI for
+``_BitInt`` types is specified by version 1 of the `_BitInt Clang ABI
+Specification `_, using ``64`` as
+the value for ``MaxFundamentalWidth`` and ``long`` as the type for
+``chunk_t``."
+
+High Level
+==========
+
+``_BitInt`` describes a family of related bit-precise integer types
+standardized in C23. The original proposal is
+`WG14 N2763 `_.
+A ``_BitInt`` type is parameterized to specify how wide the type is (how many
+bits it occupies), including the sign bit. For example, ``_BitInt(2)`` is a
+signed integer with one sign bit and one value bit, while
+``unsigned _BitInt(2)`` is an unsigned integer with two value bits. A
+``signed _BitInt`` must be at least two bits wide, and an ``unsigned _BitInt``
+must be at least one bit wide, so a ``_BitInt`` must have at least one value
+bit. There is an implementation-defined limit on the maximum width supported,
+specified by the ``BITINT_MAXWIDTH`` macro in ``<limits.h>``.
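As a rough illustration of the width rule described above, the following C sketch computes the value range implied by a given width. It deliberately avoids ``_BitInt`` itself so that it also compiles with pre-C23 compilers. The helper names ``bitint_min`` and ``bitint_max`` are hypothetical and are not part of this specification or of any library. The sketch assumes a two's-complement representation and handles only widths up to 62 bits, so that every result fits in ``int64_t``.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers (not part of this specification): the smallest and
 * largest values representable in a _BitInt of the given width. Limited to
 * widths up to 62 so that every result fits in int64_t. */
static int64_t bitint_min(unsigned width, bool is_signed) {
    /* Signed types need at least 2 bits, unsigned types at least 1. */
    assert(width >= (is_signed ? 2u : 1u) && width <= 62u);
    /* Signed: one sign bit plus (width - 1) value bits, two's complement. */
    return is_signed ? -((int64_t)1 << (width - 1)) : 0;
}

static int64_t bitint_max(unsigned width, bool is_signed) {
    assert(width >= (is_signed ? 2u : 1u) && width <= 62u);
    /* Unsigned: all `width` bits are value bits. */
    return is_signed ? ((int64_t)1 << (width - 1)) - 1
                     : ((int64_t)1 << width) - 1;
}
```

Under these assumptions, ``bitint_min(2, true)`` and ``bitint_max(2, true)`` give -2 and 1, and ``bitint_max(2, false)`` gives 3, which matches the ``_BitInt(2)`` example in the text.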
+
+Unlike other integer types, bit-precise integer types do not undergo default
+integer promotion, including when passed as a variadic argument. This means
+that a ``_BitInt(8)`` does not promote to ``int`` when passed as an argument to
+a function or returned as a value from a function.
+
+``_BitInt`` types are ordinary object types and may be used anywhere an object
+type can be, such as a ``struct`` field, ``union`` field, or array element. In
+addition, they are integer types and may be used as the type of a bit-field.
+Like any other type, a ``_BitInt`` object may require more memory than its
+stated bit width in order to satisfy the requirements of byte (or higher)
+alignment. In other words, the width of a ``_BitInt`` affects the semantics of
+operations on the value; it is not a guarantee that the value will be "packed"
+into exactly that many bits in memory.
+
+ABI Description
+===============
+
+The ABI of ``_BitInt`` is expected to vary between architectures, but the
+following is a general ABI outline.
+
+Definitions
+-----------
+
+This generic ABI is described in terms of the following parameters, which must
+be determined by the platform ABI:
+
+``MaxFundamentalWidth`` is the bit-width of the largest fundamental integer
+type for the target that can be used to represent a ``_BitInt``. Typically,
+this will be the width of the largest integer type supported by the ABI, but a
+smaller limit is also acceptable. Once this limit is chosen for an ABI, it
+should not be modified later even if the ABI adds support for a larger
+fundamental integer type.
+
+``chunk_t`` is the fundamental integer type that the target will use to store
+the components of a ``_BitInt`` that is wider than ``MaxFundamentalWidth``.
+This should be a fundamental integer type for which the target supports
+overflow operations and will typically be the full width of a general-purpose
+register.
+
+These parameters are used to derive other relevant properties as described
+below.
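To make the role of the two parameters more concrete, here is a minimal C sketch of how a platform's choices might feed into the derived quantities ``ChunkWidth`` and ``RepWidth(N)`` defined in the next section. Everything platform-specific here is an assumption made for illustration: a ``MaxFundamentalWidth`` of 64, ``uint64_t`` standing in for ``chunk_t``, and fundamental integer types of exactly 8, 16, 32, and 64 bits. The names ``rep_width`` and ``chunk_count`` are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical platform parameters, chosen only for illustration. */
#define MAX_FUNDAMENTAL_WIDTH 64u /* MaxFundamentalWidth */
typedef uint64_t chunk_t;         /* chunk_t */

/* ChunkWidth: the number of bits in one chunk_t. */
#define CHUNK_WIDTH ((unsigned)(sizeof(chunk_t) * CHAR_BIT))

/* RepWidth(N), assuming the fundamental integer types are 8, 16, 32, and
 * 64 bits wide. */
static unsigned rep_width(unsigned n) {
    assert(n >= 1);
    if (n <= MAX_FUNDAMENTAL_WIDTH) {
        unsigned w = 8; /* narrowest fundamental type assumed here */
        while (w < n)
            w *= 2;
        return w;
    }
    /* Round up to the least multiple of ChunkWidth that is at least n. */
    return (n + CHUNK_WIDTH - 1) / CHUNK_WIDTH * CHUNK_WIDTH;
}

/* Number of chunk_t elements needed for a _BitInt wider than
 * MaxFundamentalWidth. */
static unsigned chunk_count(unsigned n) {
    assert(n > MAX_FUNDAMENTAL_WIDTH);
    return rep_width(n) / CHUNK_WIDTH;
}
```

With these assumed parameters, a 9-bit ``_BitInt`` has a representation width of 16, and a 129-bit ``_BitInt`` has a representation width of 192, stored as three chunks. A platform with a different ``chunk_t`` or a different set of fundamental types would get different results.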
+
+Object Layout (excluding bit-fields)
+------------------------------------
+
+``ChunkWidth`` is defined as ``sizeof(chunk_t) * CHAR_BIT``.
+
+``RepWidth(N)`` is the *representation width* of a ``_BitInt`` of width ``N``.
+If ``N <= MaxFundamentalWidth``, then ``RepWidth(N)`` is the bit-width of the
+lowest-rank fundamental integer type (excluding ``_Bool``) with at least ``N``
+bits. Otherwise, ``RepWidth(N)`` is the least multiple of ``ChunkWidth`` that
+is at least ``N``. It is therefore always true that ``N <= RepWidth(N)``.
+When ``N < RepWidth(N)``, a ``_BitInt(N)`` is represented exactly as if it
+were extended to a ``_BitInt(RepWidth(N))`` of the same signedness. The value
+is held in the least significant bits, and the excess (most significant) bits
+have unspecified values.
+
+If ``RepWidth(N) <= MaxFundamentalWidth``, then ``signed _BitInt(N)`` and
+``unsigned _BitInt(N)`` have the same representation width as the smallest
+fundamental integer type of at least ``RepWidth(N)`` bits with the same
+signedness, excluding ``_Bool``.
+
+Otherwise, ``signed _BitInt(N)`` and ``unsigned _BitInt(N)`` have the same
+representation width as a ``struct`` containing a single member of type
+``chunk_t[RepWidth(N) / ChunkWidth]``. The element at index ``Idx`` stores
+bits ``Idx * ChunkWidth`` through ``(Idx + 1) * ChunkWidth - 1``. That is, the
+array is always stored in little-endian order, which should have better
+memory-access properties regardless of the target endianness. Individual
+elements of this array still use the target's natural endianness.
+
+Bit-Field Layout
+----------------
+
+When a ``_BitInt(N)`` is used as the type of a bit-field, the width of the
+allocation unit is ``RepWidth(N)``. For example, ``_BitInt(9) i : 8;`` occupies
+a two-byte allocation unit, not a one-byte allocation unit. The unused bits
+from the ``_BitInt`` should be packed together with an adjacent bit-field of a
+non-``_BitInt`` type.
+Bit-fields of types ``_BitInt(N)`` and ``_BitInt(M)`` should pack together
+even when ``N != M``.
+
+Passing and Returning an Object
+-------------------------------
+
+When a ``_BitInt(N)`` is passed as an argument to a function or returned from a
+function,
+
+* if ``RepWidth(N) <= MaxFundamentalWidth``, the object is passed to the
+  function or returned from the function in the same manner as an integer
+  object with width ``RepWidth(N)``,
+
+* otherwise, the object is passed to the function or returned from the function
+  in the same manner as a ``struct`` containing a single member of type
+  ``chunk_t[RepWidth(N) / ChunkWidth]``.
+
+
+Rationale and Alternative Approaches
+====================================
+
+Target architectures may have different needs that are not met by the above
+generic ABI specification. This section contains information about some of the
+decisions in the generic ABI and alternative approaches that can be considered.
+
+Excess Bits
+-----------
+
+When ``N < RepWidth(N)``, the ABI has three natural alternatives:
+
+* The value is kept in the least-significant bits and the excess (most
+  significant) bits are unspecified.
+
+* The value is kept in the least-significant bits and the excess (most
+  significant) bits are required to be a proper zero- or sign-extension of the
+  value (as appropriate for the signedness of the type).
+
+* The value is left-shifted into the most-significant bits and the excess
+  (least significant) bits are required to be zero.
+
+Each of these has tradeoffs. Leaving the most-significant bits unspecified
+allows addition, subtraction, multiplication, bitwise complement, left shift,
+and narrowing conversions to avoid adjusting these bits in their results.
+Forcing the most-significant bits to be properly extended allows comparison,
+division, right shift, and widening conversions to avoid adjusting these bits
+in their operands.
+Keeping the value left-shifted is good for both addition and comparison, but
+other operations (especially conversions) become more complex, and the
+representation is less "natural", which can complicate interacting with
+other systems. Furthermore, having unspecified bits means that bitwise equality
+can be false even when semantic equality holds, but not having unspecified bits
+means that there are trap representations which can lead to undefined behavior.
+
+This ABI leaves the most-significant bits unspecified out of a belief that
+doing so should optimize the most common operations and avoid the most
+complexity in practice.
\ No newline at end of file
Index: clang/docs/index.rst
===================================================================
--- clang/docs/index.rst
+++ clang/docs/index.rst
@@ -97,6 +97,7 @@
    ItaniumMangleAbiTags
    HardwareAssistedAddressSanitizerDesign.rst
    ConstantInterpreter
+   BitIntABI
 
 
 Indices and tables