diff --git a/llvm/docs/ProgrammersManual.rst b/llvm/docs/ProgrammersManual.rst
--- a/llvm/docs/ProgrammersManual.rst
+++ b/llvm/docs/ProgrammersManual.rst
@@ -2470,6 +2470,99 @@
 choice for representing sets which have lots of very short ranges. E.g. the set
 `{2*x : x \in [0, n)}` would be a pathological input.
 
+.. _utility_functions:
+
+Useful Utility Functions
+========================
+
+LLVM implements a number of general utility functions used across the
+codebase. You can find the most common ones in ``STLExtras.h``
+(`doxygen `__). Some of these wrap
+well-known C++ standard library functions, while others are unique to LLVM.
+
+.. _uf_iteration:
+
+Iterating over ranges
+---------------------
+
+Sometimes you may want to iterate over more than one range at a time or know
+the index of the current element. LLVM provides custom utility functions to
+make that easier, without having to manually manage iterators and/or indices:
+
+.. _uf_zip:
+
+The ``zip``\ * functions
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+``zip``\ * functions allow for iterating over elements from two or more ranges
+at the same time. For example:
+
+.. code-block:: c++
+
+    SmallVector Counts = ...;
+    char Letters[26] = ...;
+    for (auto [Letter, Count] : zip_equal(Letters, Counts))
+      errs() << Letter << ": " << Count << "\n";
+
+Note that the elements are provided through a 'reference wrapper' proxy type
+(tuple of references), which combined with the structured bindings declaration
+makes ``Letter`` and ``Count`` references to range elements. Any modification
+to these references will affect the elements of ``Letters`` or ``Counts``.
+
+The ``zip``\ * functions support temporary ranges, for example:
+
+.. code-block:: c++
+
+    for (auto [Letter, Count] : zip(SmallVector{'a', 'b', 'c'}, Counts))
+      errs() << Letter << ": " << Count << "\n";
+
+The difference between the functions in the ``zip`` family is how they behave
+when the supplied ranges have different lengths:
+
+* ``zip_equal`` -- requires all input ranges to have the same length.
+* ``zip`` -- iteration stops when the end of the shortest range is reached.
+* ``zip_first`` -- requires the first range to be the shortest one.
+* ``zip_longest`` -- iteration continues until the end of the longest range is
+  reached. The non-existent elements of shorter ranges are replaced with
+  ``std::nullopt``.
+
+The length requirements are checked with ``assert``\ s.
+
+As a rule of thumb, prefer to use ``zip_equal`` when you expect all
+ranges to have the same length, and consider alternative ``zip`` functions only
+when this is not the case. This is because ``zip_equal`` clearly communicates
+this same-length assumption and has the best (release-mode) runtime performance.
+
+.. _uf_enumerate:
+
+``enumerate``
+^^^^^^^^^^^^^
+
+The ``enumerate`` function allows iterating over one or more ranges while
+keeping track of the index of the current loop iteration. For example:
+
+.. code-block:: c++
+
+    for (auto [Idx, BB, Value] : enumerate(Phi->blocks(),
+                                           Phi->incoming_values()))
+      errs() << "#" << Idx << " " << BB->getName() << ": " << *Value << "\n";
+
+The current element index is provided as the first structured bindings element.
+Alternatively, the index and the element value can be obtained with the
+``index()`` and ``value()`` member functions:
+
+.. code-block:: c++
+
+    char Letters[26] = ...;
+    for (auto En : enumerate(Letters))
+      errs() << "#" << En.index() << " " << En.value() << "\n";
+
+Note that ``enumerate`` has ``zip_equal`` semantics and provides elements
+through a 'reference wrapper' proxy, which makes them modifiable when accessed
+through structured bindings or the ``value()`` member function. When two or more
+ranges are passed, ``enumerate`` requires them to have equal lengths (checked
+with an ``assert``).
+
 .. _debugging:
 
 Debugging