Index: docs/LanguageExtensions.rst
===================================================================
--- docs/LanguageExtensions.rst
+++ docs/LanguageExtensions.rst
@@ -1770,8 +1770,9 @@
 
 The ``#pragma clang loop`` directive is used to specify hints for optimizing the
 subsequent for, while, do-while, or c++11 range-based for loop. The directive
-provides options for vectorization and interleaving. Loop hints can be specified
-before any loop and will be ignored if the optimization is not safe to apply.
+provides options for vectorization, interleaving, and unrolling. Loop hints can
+be specified before any loop and will be ignored if the optimization is not safe
+to apply.
 
 A vectorized loop performs multiple iterations of the original loop in parallel
 using vector instructions. The instruction set of the target
@@ -1786,9 +1787,16 @@
 a cost model that depends on the register pressure and generated code size to
 select the interleaving count.
 
-Vectorization is enabled by ``vectorize(enable)`` and interleaving is enabled
-by ``interleave(enable)``. This is useful when compiling with ``-Os`` to
-manually enable vectorization or interleaving.
+Loop unrolling replicates the body of a loop, increasing the loop size
+and reducing the trip count. Loop control overhead can be reduced or
+eliminated, and additional instruction-level parallelism (ILP) can be
+exposed. The unroller selects an unroll count based on a limit on the
+growth of code size and whether the loop can be unrolled completely.
+
+Vectorization is enabled by ``vectorize(enable)``, interleaving is enabled
+by ``interleave(enable)``, and unrolling is enabled by ``unroll(enable)``.
+This is useful when compiling with ``-Os`` to manually enable vectorization or
+interleaving.
 
 .. code-block:: c++
 
@@ -1798,22 +1806,28 @@
     ...
   }
 
-The vector width is specified by ``vectorize_width(_value_)`` and the interleave
-count is specified by ``interleave_count(_value_)``, where
-_value_ is a positive integer. This is useful for specifying the optimal
-width/count of the set of target architectures supported by your application.
+If ``unroll(enable)`` is specified, the unroller will attempt to unroll
+the loop completely if the trip count is known at compile time. If
+the trip count is not known, the loop will still be unrolled subject to
+a limit on growth of code size.
+
+The vector width is specified by ``vectorize_width(_value_)``, interleave count
+is specified by ``interleave_count(_value_)``, and unroll count is specified by
+``unroll_count(_value_)``, where _value_ is a positive integer. This is useful
+for specifying the optimal width/count for the set of target architectures
+supported by your application.
 
 .. code-block:: c++
 
-
   #pragma clang loop vectorize_width(2)
   #pragma clang loop interleave_count(2)
+  #pragma clang loop unroll_count(4)
   for(...) {
     ...
   }
 
 Specifying a width/count of 1 disables the optimization, and is equivalent to
-``vectorize(disable)`` or ``interleave(disable)``.
+``vectorize(disable)``, ``interleave(disable)``, or ``unroll(disable)``.
 
 For convenience multiple loop hints can be specified on a single line.
 
Index: docs/ReleaseNotes.rst
===================================================================
--- docs/ReleaseNotes.rst
+++ docs/ReleaseNotes.rst
@@ -101,9 +101,10 @@
 -----------------------
 
 Loop optimization hints can be specified using the new `#pragma clang loop`
-directive just prior to the desired loop. The directive allows vectorization
-and interleaving to be enabled or disabled, and the vector width and interleave
-count to be manually specified. See language extensions for details.
+directive just prior to the desired loop. The directive allows vectorization,
+interleaving, and unrolling to be enabled or disabled. Vector width as well
+as interleave and unroll counts can be manually specified. See language
+extensions for details.
 
 C Language Changes in Clang
 ---------------------------
Index: include/clang/Basic/AttrDocs.td
===================================================================
--- include/clang/Basic/AttrDocs.td
+++ include/clang/Basic/AttrDocs.td
@@ -1016,9 +1016,10 @@
   let Category = DocCatStmt;
   let Content = [{
 The ``#pragma clang loop'' directive allows loop optimization hints to be
-specified for the subsequent loop. The directive allows vectorization
-and interleaving to be enabled or disabled, and the vector width and interleave
-count to be manually specified. See `language extensions
+specified for the subsequent loop. The directive allows vectorization,
+interleaving, and unrolling to be enabled or disabled. Vector width as well
+as interleave and unroll counts can be manually specified. See
+`language extensions
 '_ for details.
   }];
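
Review note, not part of the patch: the documented hints could be exercised with something like the following minimal sketch. The function names `sum_unrolled` and `scale_in_place` are hypothetical and exist only for illustration. The hints are requests rather than guarantees, so the results should be the same whether or not the optimizer honors them; compilers other than Clang are expected to ignore the unknown pragma, possibly with a warning.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: sums a buffer while asking the unroller for a
// fixed unroll count of 4. The hint affects code generation only.
int sum_unrolled(const std::vector<int>& values) {
  int total = 0;
#pragma clang loop unroll_count(4)
  for (std::size_t i = 0; i < values.size(); ++i) {
    total += values[i];
  }
  return total;
}

// Hypothetical helper: scales a buffer in place, with two hints placed
// on a single pragma line as the documentation text allows.
void scale_in_place(std::vector<float>& values, float factor) {
#pragma clang loop vectorize_width(2) interleave_count(2)
  for (std::size_t i = 0; i < values.size(); ++i) {
    values[i] *= factor;
  }
}
```

Calling `sum_unrolled` on `{1, 2, 3, 4, 5}` and comparing against a build with `unroll(disable)` would be one way to confirm that the hint changes only the generated code, not the computed value.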