# clang/docs/LanguageExtensions.rst


See also :ref:`langext-__builtin_shufflevector`, :ref:`langext-__builtin_convertvector`.

.. [#] ternary operator(?:) has different behaviors depending on condition
  operand's vector type. If the condition is a GNU vector (i.e. __vector_size__),
  it's only available in C++ and uses normal bool conversions (that is, != 0).
  If it's an extension (OpenCL) vector, it's only available in C and OpenCL C.
  And it selects based on signedness of the condition operands (OpenCL v1.1 s6.3.9).

Vector Builtins
---------------

**Note: The implementation of vector builtins is work-in-progress and incomplete.**

In addition to the operators mentioned above, Clang provides a set of builtins
to perform additional operations on certain scalar and vector types.

Let ``T`` be one of the following types:

* an integer type (as in C2x 6.2.5p19), but excluding enumerated types and _Bool
* the standard floating types float or double
* a half-precision floating point type, if one is supported on the target
* a vector type.

For scalar types, consider the operation applied to a vector with a single element.

*Elementwise Builtins*

Each builtin returns a vector equivalent to applying the specified operation
elementwise to the input.

Unless specified otherwise operation(±0) = ±0 and operation(±infinity) = ±infinity

========================================= ================================================================ =========================================
 Name                                      Operation                                                        Supported element types
========================================= ================================================================ =========================================
 T __builtin_elementwise_abs(T x)          return the absolute value of a number x; the absolute value of   signed integer and floating point types
                                           the most negative integer remains the most negative integer
 T __builtin_elementwise_ceil(T x)         return the smallest integral value greater than or equal to x    floating point types
 T __builtin_elementwise_floor(T x)        return the largest integral value less than or equal to x        floating point types
 T __builtin_elementwise_roundeven(T x)    round x to the nearest integer value in floating point format,   floating point types
                                           rounding halfway cases to even (that is, to the nearest value
                                           that is an even integer), regardless of the current rounding
                                           direction.
 T __builtin_elementwise_trunc(T x)        return the integral value nearest to but no larger in            floating point types
                                           magnitude than x
 T __builtin_elementwise_max(T x, T y)     return x or y, whichever is larger                               integer and floating point types
 T __builtin_elementwise_min(T x, T y)     return x or y, whichever is smaller                              integer and floating point types
========================================= ================================================================ =========================================

scanon: "Prevailing rounding mode" is not super-useful, other than as a spelling for round-to-nearest…

fhahn: I removed `rint` and `round` for now and added `_roundeven` with the wording from TS 18661-1.
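Two rows of the table above are easy to misread: the behaviour of ``abs`` on the most negative integer, and the tie-breaking rule of ``roundeven``. The sketch below models both per element in portable C++. The names ``model_elementwise_abs`` and ``model_roundeven`` are hypothetical helpers invented for illustration, not Clang builtins, and the rounding model is only intended for finite inputs.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical scalar model of the abs row for 32-bit signed elements:
// every element is replaced by its absolute value, except the most
// negative integer, which is left unchanged.
template <std::size_t N>
std::array<std::int32_t, N> model_elementwise_abs(std::array<std::int32_t, N> v) {
    for (auto &x : v) {
        if (x == std::numeric_limits<std::int32_t>::min())
            continue;  // the most negative integer remains as it is
        if (x < 0)
            x = -x;
    }
    return v;
}

// Hypothetical model of the roundeven row for one finite double:
// round to the nearest integer value, halfway cases go to the even
// neighbour, independent of the current rounding direction.
double model_roundeven(double x) {
    double lo = std::floor(x);  // nearest integer value not above x
    double frac = x - lo;       // distance from lo, in [0, 1)
    double r;
    if (frac < 0.5)
        r = lo;
    else if (frac > 0.5)
        r = lo + 1.0;
    else
        r = (std::fmod(lo, 2.0) == 0.0) ? lo : lo + 1.0;  // tie: pick even
    return std::copysign(r, x);  // keep the sign of x, including for zero
}
```

Under this model, ``model_roundeven(2.5)`` gives ``2.0`` and ``model_roundeven(3.5)`` gives ``4.0``, which is the difference from rounding halfway cases away from zero.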

*Reduction Builtins*

craig.topper: I'm not sure I understand what is being concatenated here.

fhahn: I tried to spell it out more clearly. I'm still not sure if that spells it out as clearly as possible and I'd appreciate any suggestions on how to improve the wording.

scanon: It's unclear because there's no apparent "first" or "second" vector; there's just a single argument, and the result isn't a vector, it's a scalar. I think you want to say something like: "the operation is repeatedly applied to adjacent pairs of elements until the result is a scalar" and then provide a worked example.

craig.topper: Should it somehow mention the pair is the even element `i` and the odd element `i+1`. There are… Should probably spell out the non-power2 behavior. Presumably we pad identity elements after the last element to widen the vector out to a power 2 and then proceed normally?

fhahn: Thanks, I tried to update the wording to make it clear that it operates on even-odd non-overlapping pairs. Good point, done! Used and added an example.

craig.topper: The input is a single vector. I'm not understanding where we get a second vector to concatenate.

fhahn: Oh yes, now I see where the confusion was coming from. I was thinking about the reduction tree and how the input is broken up. Sorry for the confusing wording. I gave it another try, should be much simpler again now.

Each builtin returns a scalar equivalent to applying the specified
operation(x, y) as recursive even-odd pairwise reduction to all vector
elements. ``operation(x, y)`` is repeatedly applied to each non-overlapping
even-odd element pair with indices ``i * 2`` and ``i * 2 + 1`` with
``i in [0, Number of elements / 2)``. If the number of elements is not a
power of 2, the vector is widened with neutral elements for the reduction
at the end to the next power of 2.

kparzysz: It's really not clear what "horizontal recursive pairwise" means unless one has read the mailing list discussions. Maybe you could spell it out, e.g. "recursive even-odd pairwise reduction" or something like that.

fhahn: Thanks, I used that wording!

craig.topper: widening -> widened

fhahn: thanks, should be fixed!

Example:

.. code-block:: c++

    __builtin_reduce_add([e3, e2, e1, e0]) = __builtin_reduce_add([e3 + e2, e1 + e0])
                                           = (e3 + e2) + (e1 + e0)

scanon: Should be restricted to integer types.

scanon: (Never mind, somehow read this as `&` instead of `\+`.)
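The recursive even-odd pairwise reduction described above can be modelled in plain C++ to make the grouping and the padding explicit. This is only a sketch under stated assumptions: ``pairwise_reduce`` and ``reduction_tree`` are hypothetical names invented here, the vector is modelled as a ``std::vector``, and the caller supplies the neutral element. It says nothing about how Clang actually lowers the builtins.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of the reduction order: pad with the neutral element
// up to the next power of 2, then repeatedly combine the non-overlapping
// pairs (2*i, 2*i + 1) until a single value is left.
template <typename T, typename Op>
T pairwise_reduce(std::vector<T> v, T neutral, Op op) {
    if (v.empty())
        return neutral;  // assumption: not reachable for real vector types
    std::size_t n = 1;
    while (n < v.size())
        n *= 2;
    v.resize(n, neutral);  // widen at the end with neutral elements
    while (v.size() > 1) {
        std::vector<T> next(v.size() / 2);
        for (std::size_t i = 0; i < next.size(); ++i)
            next[i] = op(v[2 * i], v[2 * i + 1]);
        v = std::move(next);
    }
    return v[0];
}

// Renders the grouping as text so the association order is visible.
std::string reduction_tree(const std::vector<std::string> &names,
                           const std::string &neutral) {
    return pairwise_reduce(names, neutral,
                           [](const std::string &a, const std::string &b) {
                               return "(" + a + " + " + b + ")";
                           });
}
```

With elements listed from index 0 upward, ``reduction_tree({"e0", "e1", "e2", "e3"}, "0")` ` yields ``((e0 + e1) + (e2 + e3))``, the same grouping as the example above, and a three-element input is padded to ``((e0 + e1) + (e2 + 0))``.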

Let ``VT`` be a vector type and ``ET`` the element type of ``VT``.

======================================= ================================================================ ==================================
 Name                                    Operation                                                        Supported element types
======================================= ================================================================ ==================================
 ET __builtin_reduce_max(VT a)           return x or y, whichever is larger; if exactly one argument is   integer and floating point types
                                         a NaN, return the other argument. If both arguments are NaNs,
                                         return a NaN (as fmax() does).
 ET __builtin_reduce_min(VT a)           return x or y, whichever is smaller; if exactly one argument     integer and floating point types
                                         is a NaN, return the other argument. If both arguments are
                                         NaNs, return a NaN (as fmin() does).
 ET __builtin_reduce_add(VT a)           \+                                                               integer and floating point types
 ET __builtin_reduce_and(VT a)           &                                                                integer types
 ET __builtin_reduce_or(VT a)            \|                                                               integer types
 ET __builtin_reduce_xor(VT a)           ^                                                                integer types
======================================= ================================================================ ==================================

kito-cheng: The example above use `__builtin_reduce_fadd`, but not listed here? or should we just use…

fhahn: Thanks it should be `_add` instead of `_fadd`. fixed.
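The NaN rules in the max and min rows match those of the C library functions ``fmax`` and ``fmin``, so a scalar model can lean on them directly. The sketch below uses hypothetical helper names (``model_reduce_max``, ``model_reduce_min``) and a plain left-to-right fold rather than the pairwise tree, on the assumption that the grouping does not change which value a max or min selects; it is an illustration of the table's wording, not of Clang's implementation.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical model of the max row for double elements.
// Precondition: v is non-empty (vector types always have elements).
double model_reduce_max(const std::vector<double> &v) {
    double acc = v[0];
    for (std::size_t i = 1; i < v.size(); ++i)
        acc = std::fmax(acc, v[i]);  // a single NaN operand is ignored
    return acc;
}

// Hypothetical model of the min row, same precondition.
double model_reduce_min(const std::vector<double> &v) {
    double acc = v[0];
    for (std::size_t i = 1; i < v.size(); ++i)
        acc = std::fmin(acc, v[i]);  // a single NaN operand is ignored
    return acc;
}
```

In this model a NaN element only reaches the result when every element is a NaN.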

Matrix Types
============

Clang provides an extension for matrix types, which is currently being
implemented. See :ref:`the draft specification <matrixtypes>` for more details.

For example, the code below uses the matrix types extension to multiply two 4x4
float matrices and add the result to a third 4x4 matrix.

scanon: "Prevailing rounding mode" is not super-useful, other than as a spelling for round-to-nearest-ties-to-even (IEEE 754 default rounding). Outside of a `FENV_ACCESS ON` context, there's not even really a notion of "prevailing rounding mode" to appeal to. I assume the intent is for this to lower to e.g. x86 ROUND* with the dynamic rounding-mode immediate.

I would recommend adding `__builtin_elementwise_roundeven(T x)` instead, which would statically bind IEEE default rounding (following TS 18661-1 naming) without having to appeal to prevailing rounding mode, and can still lower to ROUND* on x86 outside of FENV_ACCESS ON contexts, which is the norm for vector code (and FRINTN unconditionally on armv8). I think we can punt on rint/nearbyint for now, and add them in the future if there's a need.