Index: llvm/docs/LoopTerminology.rst =================================================================== --- llvm/docs/LoopTerminology.rst +++ llvm/docs/LoopTerminology.rst @@ -161,11 +161,134 @@ has a predecessor that is outside the loop. This implies that all exit blocks are dominated by the loop header. +.. _loop-terminology-lcssa: Loop Closed SSA (LCSSA) ======================= -TBD +A program is in Loop Closed SSA Form if it is in SSA form +and all values that are defined in a loop are used only inside +this loop. +Programs written in LLVM IR are always in SSA form but not necessarily +in LCSSA. To achieve the latter, single entry PHI nodes are inserted +at the end of the loops for all values that are live across the loop boundary. +In particular, consider the following loop: + +.. code-block:: C + + c = ...; + for (...) { + if (c) + X1 = ... + else + X2 = ... + X3 = phi(X1, X2); // X3 defined + } + + ... = X3 + 4; // X3 used, i.e. live + // outside the loop + +In the inner loop, the X3 is defined inside the loop, but used +outside of it. In loop closed SSA form, this would be represented like this: + +.. code-block:: C + + c = ...; + for (...) { + if (c) + X1 = ... + else + X2 = ... + X3 = phi(X1, X2); + } + X4 = phi(X3); + + ... = X4 + 4; + +This is still valid LLVM; the extra phi nodes are purely redundant, +but all LoopPass'es are required to preserve them. +This form is ensured by the LCSSA (:ref:`-loop-simplify `) +pass and is added automatically by the LoopPassManager when +scheduling a LoopPass. +After the loop optimizations are done, these extra phi nodes +will be deleted by :ref:`-instcombine `. + +The major benefit of this transformation is that it makes many other +loop optimizations simpler. + +First of all, a simple observation is that if one needs to see all +the outside users, they can just iterate over all the (loop closed) +PHI nodes in the exit blocks. + +Then, consider for example +:ref:`-loop-unswitch ` ing the loop above. +Because it is in LCSSA form, we know that any value defined inside of +the loop will be used either only inside the loop or in a loop closed +PHI node. In this case, the only loop closed PHI node is X4. +This means that we can just copy the loop and change the X4 +accordingly, like so: + +.. code-block:: C + + for (...) { + c = ...; + if (c) { + for (...) { + if (true) + X1 = ... + else + X2 = ... + X3 = phi(X1, X2); + } + } else { + for (...) { + if (false) + X1' = ... + else + X2' = ... + X3' = phi(X1', X2'); + } + } + X4 = phi(X3, X3') + +Now, all uses of X4 will get the updated value. +If we did not have Loop Closed SSA form, it means that X3 could +possibly be used outside the loop. So, we would have to introduce the +X4 (which is the new X3) and replace all uses of X3 with that. + +In general, if a loop is in LCSSA form, in any loop transformation, +we only need to update the loop closed PHI nodes for the changes +to take effect. However, we should note one thing: +LLVM keeps all the uses of a value in an internal data structure. +That means we don't have to do deep analysis to find and replace +all uses of a value, we can just iterate this data structure (there +is even a utility function, replaceAllUsesWith(), that performs +this transformation). + +Another important advantage is that the behavior of all uses +of an induction variable is the same. Without this, you need to +distinguish the case when the variable is used outside of +the loop it is defined in, for example: + +.. code-block:: C + + for (i = 0; i < 100; i++) { + for (j = 0; j < 100; j++) { + k = i + j; + use(k); // use 1 + } + use(k); // use 2 + } + +Looking from the outer loop with the normal SSA form, the first use of k +is not well-behaved, while the second one is an induction variable with +base 100 and step 1. Although, in practice, and in the LLVM context, +SCEV should have no problem with it (it's just two AddRecExpr +in a SCEVExpr). + +Another similar advantage is that with LCSSA, ScalarEvolution::getSCEV +is sufficient. Otherwise, one needs to use getSCEVAtScope +to specify whether we are using the SCEV inside or outside the loop. "More Canonical" Loops ====================== Index: llvm/docs/Passes.rst =================================================================== --- llvm/docs/Passes.rst +++ llvm/docs/Passes.rst @@ -711,6 +711,8 @@ In this case, the unconditional branch at the end of the first if can be revectored to the false side of the second if. +.. _passes-lcssa: + ``-lcssa``: Loop-Closed SSA Form Pass ------------------------------------- @@ -732,7 +734,8 @@ This is still valid LLVM; the extra phi nodes are purely redundant, and will be trivially eliminated by ``InstCombine``. The major benefit of this transformation is that it makes many other loop optimizations, such as -``LoopUnswitch``\ ing, simpler. +``LoopUnswitch``\ ing, simpler. You can read more in the +:ref:`loop terminology section for the LCSSA form `. .. _passes-licm: @@ -864,6 +867,8 @@ can lead to significant performance improvements. It uses :ref:`Dependence Analysis ` for proving the transformations are safe. +.. _passes-loop-unswitch: + ``-loop-unswitch``: Unswitch loops ----------------------------------