diff --git a/llvm/docs/MemorySSA.rst b/llvm/docs/MemorySSA.rst --- a/llvm/docs/MemorySSA.rst +++ b/llvm/docs/MemorySSA.rst @@ -256,20 +256,78 @@ ``MemoryDef`` or ``MemoryPhi`` of said ``MemoryDef``. -Build-time use optimization +Use and Def optimization --------------------------- -``MemorySSA`` will optimize some ``MemoryAccess``\ es at build-time. -Specifically, we optimize the operand of every ``MemoryUse`` to point to the +``MemoryUse``\ s keep a single operand, which is their defining or optimized +access. +Traditionally ``MemorySSA`` optimized ``MemoryUse``\ s at build-time, up to a +given threshold. +Specifically, the operand of every ``MemoryUse`` was optimized to point to the actual clobber of said ``MemoryUse``. This can be seen in the above example; the second ``MemoryUse`` in ``if.end`` has an operand of ``1``, which is a ``MemoryDef`` from the entry block. This is done to make walking, value numbering, etc, faster and easier. +As of `this revision `_, the default was +changed to not optimize uses at build time, in order to provide the option to +reduce compile-time if the walking is not necessary in a pass. Most users call +the new API ``ensureOptimizedUses()`` to keep the previous behavior and do a +one-time optimization of ``MemoryUse``\ s, if this was not done before. +New pass users are recommended to call ``ensureOptimizedUses()``. + +Initially it was not possible to optimize ``MemoryDef`` in the same way, as we +restricted ``MemorySSA`` to one operand per access. +This was changed and ``MemoryDef``\ s now keep two operands. +The first one, the defining access, is +always the previous ``MemoryDef`` or ``MemoryPhi`` in the same basic block, or +the last one in a dominating predecessor if the current block doesn't have any +other accesses writing to memory. This is needed for walking Def chains. +The second operand is optimized access, if there was a previous call on the +walker's `getClobberingMemoryAccess(MA)`. This API will cache information +as part of `MA`. +Optimizing all ``MemoryDefs`` has quadratic time complexity and is not done +by default. + +A walk of the uses for any MemoryDef can find the accesses that were optimized +to it. +A code snippet for such a walk looks like this: + +.. code-block:: c++ + MemoryDef *Def; // find who's optimized or defining for this MemoryDef + for (auto& U : Def->uses()) { + MemoryAccess *MA = cast(Use.getUser()); + if (auto *DefUser = cast_of_nullMA) + if (DefUser->isOptimized() && DefUser->getOptimized() == Def) { + // User who is optimized to Def + } else { + // User who's defining access is Def; optimized to something else or not optimized. + } + } -It is not possible to optimize ``MemoryDef`` in the same way, as we -restrict ``MemorySSA`` to one memory variable and, thus, one Phi node -per block. +When ``MemoryUse``\ s are optimized, for a given store, you can find all loads +clobbered by that store by walking the immediate and transitive uses of +the store. + +.. code-block:: c++ + checkUses(MemoryAccess *Def) { // Def can be a MemoryDef or a MemoryPhi. + for (auto& U : Def->uses()) { + MemoryAccess *MA = cast(Use.getUser()); + if (auto *MU = cast_of_nullMA) { + // Process MemoryUse as needed. + } + else { + // Process MemoryDef or MemoryPhi as needed. + + // As a user can come up twice, as an optimized access and defining + // access, keep a visited list. + + // Check transitive uses as needed + checkUses (MA); // use a worklist for an iterative algorithm + } + } + } +An example of similar traversals can be found in the DeadStoreElimination pass. Invalidation and updating ------------------------- @@ -277,7 +335,11 @@ Because ``MemorySSA`` keeps track of LLVM IR, it needs to be updated whenever the IR is updated. "Update", in this case, includes the addition, deletion, and motion of ``Instructions``. The update API is being made on an as-needed basis. -If you'd like examples, ``GVNHoist`` is a user of ``MemorySSA``\ s update API. +If you'd like examples, ``GVNHoist`` and ``LICM`` are users of ``MemorySSA``\ s +update API. +Note that adding new ``MemoryDef``\ s (by calling ``insertDef``) can be a +time-consuming update, if the new access triggers many ``MemoryPhi`` insertions and +renaming (optimization invalidation) of many ``MemoryAccesses``\ es. Phi placement @@ -403,17 +465,17 @@ ^^^^^^^^^^^^^^^^^^^^^ In practice, there are implementation details in LLVM that also affect the -results' precision provided by MemorySSA. For example, AliasAnalysis has various -caps, or restrictions on looking through phis which can affect what MemorySSA +results' precision provided by ``MemorySSA``. For example, AliasAnalysis has various +caps, or restrictions on looking through phis which can affect what ``MemorySSA`` can infer. Changes made by different passes may make MemorySSA either "overly optimized" (it can provide a more acccurate result than if it were recomputed from scratch), or "under optimized" (it could infer more if it were recomputed). This can lead to challenges to reproduced results in isolation with a single pass -when the result relies on the state aquired by MemorySSA due to being updated by +when the result relies on the state aquired by ``MemorySSA`` due to being updated by multiple subsequent passes. Passes that use and update MemorySSA should do so through the APIs provided by the -MemorySSAUpdater, or through calls on the Walker. -Direct optimizations to MemorySSA are not permitted. +``MemorySSAUpdater``, or through calls on the Walker. +Direct optimizations to ``MemorySSA`` are not permitted. There is currently a single, narrowly scoped exception where DSE (DeadStoreElimination) updates an optimized access of a store, after a traversal that guarantees the optimization is correct. This is solely allowed due to the traversals and inferences @@ -422,15 +484,6 @@ help reproduce optimizations in isolation. -Use Optimization -^^^^^^^^^^^^^^^^ - -Unlike other partitioned forms, LLVM's ``MemorySSA`` does make one -useful guarantee - all loads are optimized to point at the thing that -actually clobbers them. This gives some nice properties. For example, -for a given store, you can find all loads actually clobbered by that -store by walking the immediate uses of the store. - LLVM Developers Meeting presentations -------------------------------------