Index: docs/AMDGPUUsage.rst =================================================================== --- docs/AMDGPUUsage.rst +++ docs/AMDGPUUsage.rst @@ -661,23 +661,20 @@ ``desc`` field size is a multiple of 4 bytes. The ``sh_addralign`` field of the ``.note`` section must be at least 4 to indicate at least 8 byte alignment. -The AMDGPU backend code object uses the following ELF note records in the -``.note`` section. The *Description* column specifies the layout of the note -record's ``desc`` field. All fields are consecutive bytes. Note records with -variable size strings have a corresponding ``*_size`` field that specifies the -number of bytes, including the terminating null character, in the string. The -string(s) come immediately after the preceding fields. +The AMDGPU backend code object uses the following ELF note record in the +``.note`` section. Additional note records can be present. .. table:: AMDGPU ELF Note Records :name: amdgpu-elf-note-records-table - ===== ============================== ====================================== - Name Type Description - ===== ============================== ====================================== - "AMD" ``NT_AMD_AMDGPU_HSA_METADATA`` - ===== ============================== ====================================== + ======== ============================== ====================================== + Name Type Description + ======== ============================== ====================================== + "AMDGPU" ``NT_AMDGPU_METADATA`` Metadata in Message Pack [MsgPack]_ + binary format. + ======== ============================== ====================================== .. @@ -687,17 +684,15 @@ ============================== ===== Name Value ============================== ===== - *reserved* 0-9 - ``NT_AMD_AMDGPU_HSA_METADATA`` 10 - *reserved* 11 + *reserved* 0-5 + ``NT_AMDGPU_METADATA`` 6 ============================== ===== -``NT_AMD_AMDGPU_HSA_METADATA`` - Specifies extensible metadata associated with the code objects executed on HSA - [HSA]_ compatible runtimes such as AMD's ROCm [AMD-ROCm]_. It is required when - the target triple OS is ``amdhsa`` (see :ref:`amdgpu-target-triples`). See - :ref:`amdgpu-amdhsa-code-object-metadata` for the syntax of the code - object metadata string. +``NT_AMDGPU_METADATA`` + Specifies extensible metadata associated with an AMDGPU code + object. It is encoded as a map in the Message Pack [MsgPack]_ binary + data format. See :ref:`amdgpu-amdhsa-code-object-metadata` for the + map keys defined for the ``amdhsa`` OS. .. _amdgpu-symbols: @@ -991,7 +986,7 @@ The code object metadata specifies extensible metadata associated with the code objects executed on HSA [HSA]_ compatible runtimes such as AMD's ROCm -[AMD-ROCm]_. It is specified by the ``NT_AMD_AMDGPU_HSA_METADATA`` note record +[AMD-ROCm]_. It is specified by the ``NT_AMD_AMDGPU_METADATA`` note record (see :ref:`amdgpu-note-records`) and is required when the target triple OS is ``amdhsa`` (see :ref:`amdgpu-target-triples`). It must contain the minimum information necessary to support the ROCM kernel queries. For example, the @@ -999,421 +994,405 @@ runtime may require other information to be included. For example, the AMD OpenCL runtime records kernel argument information. -The metadata is specified as a YAML formatted string (see [YAML]_ and -:doc:`YamlIO`). +The metadata is represented as Message Pack formatted binary data (see +[MsgPack]_). The top level is a Message Pack map that includes the +keys defined in table +:ref:`amdgpu-amdhsa-code-object-metadata-map-table` and referenced +tables. -.. TODO - Is the string null terminated? It probably should not if YAML allows it to - contain null characters, otherwise it should be. - -The metadata is represented as a single YAML document comprised of the mapping -defined in table :ref:`amdgpu-amdhsa-code-object-metadata-mapping-table` and -referenced tables. - -For boolean values, the string values of ``false`` and ``true`` are used for -false and true respectively. - -Additional information can be added to the mappings. To avoid conflicts, any -non-AMD key names should be prefixed by "*vendor-name*.". - - .. table:: AMDHSA Code Object Metadata Mapping - :name: amdgpu-amdhsa-code-object-metadata-mapping-table - - ========== ============== ========= ======================================= - String Key Value Type Required? Description - ========== ============== ========= ======================================= - "Version" sequence of Required - The first integer is the major - 2 integers version. Currently 1. - - The second integer is the minor - version. Currently 0. - "Printf" sequence of Each string is encoded information - strings about a printf function call. The - encoded information is organized as - fields separated by colon (':'): - - ``ID:N:S[0]:S[1]:...:S[N-1]:FormatString`` - - where: - - ``ID`` - A 32 bit integer as a unique id for - each printf function call - - ``N`` - A 32 bit integer equal to the number - of arguments of printf function call - minus 1 - - ``S[i]`` (where i = 0, 1, ... , N-1) - 32 bit integers for the size in bytes - of the i-th FormatString argument of - the printf function call - - FormatString - The format string passed to the - printf function call. - "Kernels" sequence of Required Sequence of the mappings for each - mapping kernel in the code object. See - :ref:`amdgpu-amdhsa-code-object-kernel-metadata-mapping-table` - for the definition of the mapping. - ========== ============== ========= ======================================= - -.. +Additional information can be added to the maps. To avoid conflicts, +any key names should be prefixed by "*vendor-name*." where +``vendor-name`` can be the the name of the vendor and specific vendor +tool that generates the information. The prefix is abbreviated to +simply "." when it appears within a map that has been added by the +same *vendor-name*. - .. table:: AMDHSA Code Object Kernel Metadata Mapping - :name: amdgpu-amdhsa-code-object-kernel-metadata-mapping-table + .. table:: AMDHSA Code Object Metadata Map + :name: amdgpu-amdhsa-code-object-metadata-map-table - ================= ============== ========= ================================ + ================= ============== ========= ======================================= String Key Value Type Required? Description - ================= ============== ========= ================================ - "Name" string Required Source name of the kernel. - "SymbolName" string Required Name of the kernel - descriptor ELF symbol. - "Language" string Source language of the kernel. - Values include: - - - "OpenCL C" - - "OpenCL C++" - - "HCC" - - "OpenMP" - - "LanguageVersion" sequence of - The first integer is the major - 2 integers version. - - The second integer is the - minor version. - "Attrs" mapping Mapping of kernel attributes. - See - :ref:`amdgpu-amdhsa-code-object-kernel-attribute-metadata-mapping-table` - for the mapping definition. - "Args" sequence of Sequence of mappings of the - mapping kernel arguments. See - :ref:`amdgpu-amdhsa-code-object-kernel-argument-metadata-mapping-table` - for the definition of the mapping. - "CodeProps" mapping Mapping of properties related to - the kernel code. See - :ref:`amdgpu-amdhsa-code-object-kernel-code-properties-metadata-mapping-table` - for the mapping definition. - ================= ============== ========= ================================ - -.. - - .. table:: AMDHSA Code Object Kernel Attribute Metadata Mapping - :name: amdgpu-amdhsa-code-object-kernel-attribute-metadata-mapping-table - - =================== ============== ========= ============================== - String Key Value Type Required? Description - =================== ============== ========= ============================== - "ReqdWorkGroupSize" sequence of If not 0, 0, 0 then all values - 3 integers must be >=1 and the dispatch - work-group size X, Y, Z must - correspond to the specified - values. Defaults to 0, 0, 0. - - Corresponds to the OpenCL - ``reqd_work_group_size`` - attribute. - "WorkGroupSizeHint" sequence of The dispatch work-group size - 3 integers X, Y, Z is likely to be the - specified values. - - Corresponds to the OpenCL - ``work_group_size_hint`` - attribute. - "VecTypeHint" string The name of a scalar or vector - type. - - Corresponds to the OpenCL - ``vec_type_hint`` attribute. - - "RuntimeHandle" string The external symbol name - associated with a kernel. - OpenCL runtime allocates a - global buffer for the symbol - and saves the kernel's address - to it, which is used for - device side enqueueing. Only - available for device side - enqueued kernels. - =================== ============== ========= ============================== + ================= ============== ========= ======================================= + "amdhsa.version" sequence of Required - The first integer is the major + 2 integers version. Currently 1. + - The second integer is the minor + version. Currently 0. + "amdhsa.printf" sequence of Each string is encoded information + strings about a printf function call. The + encoded information is organized as + fields separated by colon (':'): + + ``ID:N:S[0]:S[1]:...:S[N-1]:FormatString`` + + where: + + ``ID`` + A 32 bit integer as a unique id for + each printf function call + + ``N`` + A 32 bit integer equal to the number + of arguments of printf function call + minus 1 + + ``S[i]`` (where i = 0, 1, ... , N-1) + 32 bit integers for the size in bytes + of the i-th FormatString argument of + the printf function call + + FormatString + The format string passed to the + printf function call. + "amdhsa.kernels" sequence of Required Sequence of the maps for each + map kernel in the code object. See + :ref:`amdgpu-amdhsa-code-object-kernel-metadata-map-table` + for the definition of the keys included + in that map. + ================= ============== ========= ======================================= .. - .. table:: AMDHSA Code Object Kernel Argument Metadata Mapping - :name: amdgpu-amdhsa-code-object-kernel-argument-metadata-mapping-table - - ================= ============== ========= ================================ - String Key Value Type Required? Description - ================= ============== ========= ================================ - "Name" string Kernel argument name. - "TypeName" string Kernel argument type name. - "Size" integer Required Kernel argument size in bytes. - "Align" integer Required Kernel argument alignment in - bytes. Must be a power of two. - "ValueKind" string Required Kernel argument kind that - specifies how to set up the - corresponding argument. - Values include: - - "ByValue" - The argument is copied - directly into the kernarg. - - "GlobalBuffer" - A global address space pointer - to the buffer data is passed - in the kernarg. - - "DynamicSharedPointer" - A group address space pointer - to dynamically allocated LDS - is passed in the kernarg. - - "Sampler" - A global address space - pointer to a S# is passed in - the kernarg. - - "Image" - A global address space - pointer to a T# is passed in - the kernarg. - - "Pipe" - A global address space pointer - to an OpenCL pipe is passed in - the kernarg. - - "Queue" - A global address space pointer - to an OpenCL device enqueue - queue is passed in the - kernarg. - - "HiddenGlobalOffsetX" - The OpenCL grid dispatch - global offset for the X - dimension is passed in the - kernarg. - - "HiddenGlobalOffsetY" - The OpenCL grid dispatch - global offset for the Y - dimension is passed in the - kernarg. - - "HiddenGlobalOffsetZ" - The OpenCL grid dispatch - global offset for the Z - dimension is passed in the - kernarg. - - "HiddenNone" - An argument that is not used - by the kernel. Space needs to - be left for it, but it does - not need to be set up. - - "HiddenPrintfBuffer" - A global address space pointer - to the runtime printf buffer - is passed in kernarg. - - "HiddenDefaultQueue" - A global address space pointer - to the OpenCL device enqueue - queue that should be used by - the kernel by default is - passed in the kernarg. - - "HiddenCompletionAction" - A global address space pointer - to help link enqueued kernels into - the ancestor tree for determining - when the parent kernel has finished. - - "ValueType" string Required Kernel argument value type. Only - present if "ValueKind" is - "ByValue". For vector data - types, the value is for the - element type. Values include: - - - "Struct" - - "I8" - - "U8" - - "I16" - - "U16" - - "F16" - - "I32" - - "U32" - - "F32" - - "I64" - - "U64" - - "F64" - - .. TODO - How can it be determined if a - vector type, and what size - vector? - "PointeeAlign" integer Alignment in bytes of pointee - type for pointer type kernel - argument. Must be a power - of 2. Only present if - "ValueKind" is - "DynamicSharedPointer". - "AddrSpaceQual" string Kernel argument address space - qualifier. Only present if - "ValueKind" is "GlobalBuffer" or - "DynamicSharedPointer". Values - are: - - - "Private" - - "Global" - - "Constant" - - "Local" - - "Generic" - - "Region" - - .. TODO - Is GlobalBuffer only Global - or Constant? Is - DynamicSharedPointer always - Local? Can HCC allow Generic? - How can Private or Region - ever happen? - "AccQual" string Kernel argument access - qualifier. Only present if - "ValueKind" is "Image" or - "Pipe". Values - are: - - - "ReadOnly" - - "WriteOnly" - - "ReadWrite" - - .. TODO - Does this apply to - GlobalBuffer? - "ActualAccQual" string The actual memory accesses - performed by the kernel on the - kernel argument. Only present if - "ValueKind" is "GlobalBuffer", - "Image", or "Pipe". This may be - more restrictive than indicated - by "AccQual" to reflect what the - kernel actual does. If not - present then the runtime must - assume what is implied by - "AccQual" and "IsConst". Values - are: - - - "ReadOnly" - - "WriteOnly" - - "ReadWrite" - - "IsConst" boolean Indicates if the kernel argument - is const qualified. Only present - if "ValueKind" is - "GlobalBuffer". - - "IsRestrict" boolean Indicates if the kernel argument - is restrict qualified. Only - present if "ValueKind" is - "GlobalBuffer". - - "IsVolatile" boolean Indicates if the kernel argument - is volatile qualified. Only - present if "ValueKind" is - "GlobalBuffer". - - "IsPipe" boolean Indicates if the kernel argument - is pipe qualified. Only present - if "ValueKind" is "Pipe". - - .. TODO - Can GlobalBuffer be pipe - qualified? - ================= ============== ========= ================================ + .. table:: AMDHSA Code Object Kernel Metadata Map + :name: amdgpu-amdhsa-code-object-kernel-metadata-map-table + + =================================== ============== ========= ================================ + String Key Value Type Required? Description + =================================== ============== ========= ================================ + ".name" string Required Source name of the kernel. + ".symbol" string Required Name of the kernel + descriptor ELF symbol. + ".language" string Source language of the kernel. + Values include: + + - "OpenCL C" + - "OpenCL C++" + - "HCC" + - "HIP" + - "OpenMP" + - "Assembler" + + ".language_version" sequence of - The first integer is the major + 2 integers version. + - The second integer is the + minor version. + ".args" sequence of Sequence of maps of the + map kernel arguments. See + :ref:`amdgpu-amdhsa-code-object-kernel-argument-metadata-map-table` + for the definition of the keys + included in that map. + ".reqd_workgroup_size" sequence of If not 0, 0, 0 then all values + 3 integers must be >=1 and the dispatch + work-group size X, Y, Z must + correspond to the specified + values. Defaults to 0, 0, 0. + + Corresponds to the OpenCL + ``reqd_work_group_size`` + attribute. + ".workgroup_size_hint" sequence of The dispatch work-group size + 3 integers X, Y, Z is likely to be the + specified values. + + Corresponds to the OpenCL + ``work_group_size_hint`` + attribute. + ".vec_type_hint" string The name of a scalar or vector + type. + + Corresponds to the OpenCL + ``vec_type_hint`` attribute. + + ".device_enqueue_symbol" string The external symbol name + associated with a kernel. + OpenCL runtime allocates a + global buffer for the symbol + and saves the kernel's address + to it, which is used for + device side enqueueing. Only + available for device side + enqueued kernels. + ".kernarg_segment_size" integer Required The size in bytes of + the kernarg segment + that holds the values + of the arguments to + the kernel. + ".group_segment_fixed_size" integer Required The amount of group + segment memory + required by a + work-group in + bytes. This does not + include any + dynamically allocated + group segment memory + that may be added + when the kernel is + dispatched. + ".private_segment_fixed_size" integer Required The amount of fixed + private address space + memory required for a + work-item in + bytes. If the kernel + uses a dynamic call + stack then additional + space must be added + to this value for the + call stack. + ".kernarg_segment_align" integer Required The maximum byte + alignment of + arguments in the + kernarg segment. Must + be a power of 2. + ".wavefront_size" integer Required Wavefront size. Must + be a power of 2. + ".sgpr_count" integer Required Number of scalar + registers required by a + wavefront for + GFX6-GFX9. A register + is required if it is + used explicitly, or + if a higher numbered + register is used + explicitly. This + includes the special + SGPRs for VCC, Flat + Scratch (GFX7-GFX9) + and XNACK (for + GFX8-GFX9). It does + not include the 16 + SGPR added if a trap + handler is + enabled. It is not + rounded up to the + allocation + granularity. + ".vgpr_count" integer Required Number of vector + registers required by + each work-item for + GFX6-GFX9. A register + is required if it is + used explicitly, or + if a higher numbered + register is used + explicitly. + ".max_flat_workgroup_size" integer Required Maximum flat + work-group size + supported by the + kernel in work-items. + Must be >=1 and + consistent with + ReqdWorkGroupSize if + not 0, 0, 0. + ".sgpr_spill_count" integer Number of stores from + a scalar register to + a register allocator + created spill + location. + ".vgpr_spill_count" integer Number of stores from + a vector register to + a register allocator + created spill + location. + =================================== ============== ========= ================================ .. - .. table:: AMDHSA Code Object Kernel Code Properties Metadata Mapping - :name: amdgpu-amdhsa-code-object-kernel-code-properties-metadata-mapping-table - - ============================ ============== ========= ===================== - String Key Value Type Required? Description - ============================ ============== ========= ===================== - "KernargSegmentSize" integer Required The size in bytes of - the kernarg segment - that holds the values - of the arguments to - the kernel. - "GroupSegmentFixedSize" integer Required The amount of group - segment memory - required by a - work-group in - bytes. This does not - include any - dynamically allocated - group segment memory - that may be added - when the kernel is - dispatched. - "PrivateSegmentFixedSize" integer Required The amount of fixed - private address space - memory required for a - work-item in - bytes. If the kernel - uses a dynamic call - stack then additional - space must be added - to this value for the - call stack. - "KernargSegmentAlign" integer Required The maximum byte - alignment of - arguments in the - kernarg segment. Must - be a power of 2. - "WavefrontSize" integer Required Wavefront size. Must - be a power of 2. - "NumSGPRs" integer Required Number of scalar - registers used by a - wavefront for - GFX6-GFX9. This - includes the special - SGPRs for VCC, Flat - Scratch (GFX7-GFX9) - and XNACK (for - GFX8-GFX9). It does - not include the 16 - SGPR added if a trap - handler is - enabled. It is not - rounded up to the - allocation - granularity. - "NumVGPRs" integer Required Number of vector - registers used by - each work-item for - GFX6-GFX9 - "MaxFlatWorkGroupSize" integer Required Maximum flat - work-group size - supported by the - kernel in work-items. - Must be >=1 and - consistent with - ReqdWorkGroupSize if - not 0, 0, 0. - "NumSpilledSGPRs" integer Number of stores from - a scalar register to - a register allocator - created spill - location. - "NumSpilledVGPRs" integer Number of stores from - a vector register to - a register allocator - created spill - location. - ============================ ============== ========= ===================== + .. table:: AMDHSA Code Object Kernel Argument Metadata Map + :name: amdgpu-amdhsa-code-object-kernel-argument-metadata-map-table + + ====================== ============== ========= ================================ + String Key Value Type Required? Description + ====================== ============== ========= ================================ + ".name" string Kernel argument name. + ".type_name" string Kernel argument type name. + ".size" integer Required Kernel argument size in bytes. + ".offset" integer Required Kernel argument offset in + bytes. The offset must be a + multiple of the alignment + required by the argument. + ".value_kind" string Required Kernel argument kind that + specifies how to set up the + corresponding argument. + Values include: + + "by_value" + The argument is copied + directly into the kernarg. + + "global_buffer" + A global address space pointer + to the buffer data is passed + in the kernarg. + + "dynamic_shared_pointer" + A group address space pointer + to dynamically allocated LDS + is passed in the kernarg. + + "sampler" + A global address space + pointer to a S# is passed in + the kernarg. + + "image" + A global address space + pointer to a T# is passed in + the kernarg. + + "pipe" + A global address space pointer + to an OpenCL pipe is passed in + the kernarg. + + "queue" + A global address space pointer + to an OpenCL device enqueue + queue is passed in the + kernarg. + + "hidden_global_offset_x" + The OpenCL grid dispatch + global offset for the X + dimension is passed in the + kernarg. + + "hidden_global_offset_y" + The OpenCL grid dispatch + global offset for the Y + dimension is passed in the + kernarg. + + "hidden_global_offset_z" + The OpenCL grid dispatch + global offset for the Z + dimension is passed in the + kernarg. + + "hidden_none" + An argument that is not used + by the kernel. Space needs to + be left for it, but it does + not need to be set up. + + "hidden_printf_buffer" + A global address space pointer + to the runtime printf buffer + is passed in kernarg. + + "hidden_default_queue" + A global address space pointer + to the OpenCL device enqueue + queue that should be used by + the kernel by default is + passed in the kernarg. + + "hidden_completion_action" + A global address space pointer + to help link enqueued kernels into + the ancestor tree for determining + when the parent kernel has finished. + + ".value_type" string Required Kernel argument value type. Only + present if ".value_kind" is + "by_value". For vector data + types, the value is for the + element type. Values include: + + - "struct" + - "i8" + - "u8" + - "i16" + - "u16" + - "f16" + - "i32" + - "u32" + - "f32" + - "i64" + - "u64" + - "f64" + + .. TODO + How can it be determined if a + vector type, and what size + vector? + ".pointee_align" integer Alignment in bytes of pointee + type for pointer type kernel + argument. Must be a power + of 2. Only present if + ".value_kind" is + "dynamic_shared_pointer". + ".address_space" string Kernel argument address space + qualifier. Only present if + ".value_kind" is "global_buffer" or + "dynamic_shared_pointer". Values + are: + + - "private" + - "global" + - "constant" + - "local" + - "generic" + - "region" + + .. TODO + Is "global_buffer" only "global" + or "constant"? Is + "dynamic_shared_pointer" always + "local"? Can HCC allow "generic"? + How can "private" or "region" + ever happen? + ".access" string Kernel argument access + qualifier. Only present if + ".value_kind" is "image" or + "pipe". Values + are: + + - "read_only" + - "write_only" + - "read_write" + + .. TODO + Does this apply to + "global_buffer"? + ".actual_access" string The actual memory accesses + performed by the kernel on the + kernel argument. Only present if + ".value_kind" is "global_buffer", + "image", or "pipe". This may be + more restrictive than indicated + by ".access" to reflect what the + kernel actual does. If not + present then the runtime must + assume what is implied by + ".access" and ".is_const" . Values + are: + + - "read_only" + - "write_only" + - "read_write" + + ".is_const" boolean Indicates if the kernel argument + is const qualified. Only present + if ".value_kind" is + "global_buffer". + + ".is_restrict" boolean Indicates if the kernel argument + is restrict qualified. Only + present if ".value_kind" is + "global_buffer". + + ".is_volatile" boolean Indicates if the kernel argument + is volatile qualified. Only + present if ".value_kind" is + "global_buffer". + + ".is_pipe" boolean Indicates if the kernel argument + is pipe qualified. Only present + if ".value_kind" is "pipe". + + .. TODO + Can "global_buffer" be pipe + qualified? + ====================== ============== ========= ================================ .. @@ -4355,7 +4334,7 @@ - *wavefront_size* defaults to 6. - *kernarg_segment_alignment*, *group_segment_alignment*, and *private_segment_alignment* default to 4. Note that alignments are specified - as a power of two, so a value of **n** means an alignment of 2^ **n**. + as a power of 2, so a value of **n** means an alignment of 2^ **n**. The *.amd_kernel_code_t* directive must be placed immediately after the function label and before any instructions. @@ -4568,6 +4547,17 @@ :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx9-table`. ======================================================== ================ ============ =================== +.amdgpu_metadata +++++++++++++++++++++++++ + +Optional directive which declares the contents of the ``NT_AMDGPU_METADATA`` +note record (see :ref:`amdgpu-elf-note-records-table`). + +The contents must be in the [YAML]_ markup format, with the same structure and +semantics described in :ref:`amdgpu-amdhsa-code-object-metadata`. + +This directive is terminated by an ``.end_amdgpu_metadata`` directive. + Example HSA Source Code (-mattr=+code-object-v3) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -4600,6 +4590,24 @@ .amdhsa_next_free_sgpr .amdgcn.next_free_sgpr .end_amdhsa_kernel + .amdgpu_metadata + --- + amdhsa.version: + - 1 + - 0 + amdhsa.kernels: + - .name: hello_world + .symbol: hello_world.kd + .kernarg_segment_size: 48 + .group_segment_fixed_size: 0 + .private_segment_fixed_size: 0 + .kernarg_segment_align: 4 + .wavefront_size: 64 + .sgpr_count: 2 + .vgpr_count: 3 + .max_flat_workgroup_size: 256 + ... + .end_amdgpu_metadata Additional Documentation ======================== @@ -4618,6 +4626,7 @@ .. [ELF] `Executable and Linkable Format (ELF) `__ .. [DWARF] `DWARF Debugging Information Format `__ .. [YAML] `YAML Ain't Markup Language (YAML™) Version 1.2 `__ +.. [MsgPack] `Message Pack `__ .. [OpenCL] `The OpenCL Specification Version 2.0 `__ .. [HRF] `Heterogeneous-race-free Memory Models `__ .. [CLANG-ATTR] `Attributes in Clang `__