This patch unifies the data structures we use for mapping instructions from the original loop to their corresponding instructions in the new loop. Previously, we maintained two distinct maps for this purpose: WidenMap and ScalarIVMap. WidenMap maintained the vector values each instruction from the old loop was represented with, and ScalarIVMap maintained the scalar values each scalarized induction variable was represented with. With this patch, all values created for the new loop (vector and scalar) are maintained in WidenMap (renamed to VectorLoopValueMap).
The change allows for several simplifications. Previously, when an instruction was scalarized, we had to insert the scalar values into vectors in order to maintain the mapping in WidenMap. Then, if a user of the scalarized value was also scalar, we had to extract the scalar values from the temporary vector we created. This caused unnecessary scalar-to-vector-to-scalar conversions, resulting in some cases in significant code bloat pre-InstCombine.
This patch avoids these unnecessary conversions by maintaining the scalar values directly. This can improve compile-time since it reduces the number of instructions in each block that InstCombine needs to simplify. If a scalarized value is used by a scalar instruction, the scalar value can be used directly. However, if the scalarized value is needed by a vector instruction, we will now generate the needed insertelement instructions on-demand.
A common idiom in several locations in the code (including the scalarization code), is to first get the vector values an instruction from the original loop maps to, and then extract a particular scalar value. This patch adds getScalarValue for this purpose along side getVectorValue as an interface into WidenMap. getScalarValue reuses a scalar value if available, or creates an extractelement instruction if a scalar value is not yet available. Similarly, getVectorValue has been modified to generate insertelement instructions on-demand if a vector version of a value is not yet available.
There's no real functional change with this patch (post-InstCombine); only compile-time savings and refactoring. However, in some cases we will generate different code. Because we now might generate an insertelement sequence on-demand, the IR post-InstCombine might be reordered somewhat. For example, instead of the insertelement sequence following the definition of an instruction, it will now precede the first use of that instruction. This can be seen in the test case changes.