Splat loads are inexpensive on X86. For a 2-lane vector we need just one
instruction: movddup (%reg), %xmm0. Using the standard Splat score leads
to worse code. This patch adds a new score dedicated to splat loads.
Please note that a splat load usually consists of three IR instructions:
- It is usually a load and 2 inserts:
    %ld = load double, double* %gep
    %ins1 = insertelement <2 x double> poison, double %ld, i32 0
    %ins2 = insertelement <2 x double> %ins1, double %ld, i32 1
- But it can also be a load, an insert and a shuffle:
    %ld = load double, double* %gep
    %ins = insertelement <2 x double> poison, double %ld, i32 0
    %shf = shufflevector <2 x double> %ins, <2 x double> poison, <2 x i32> zeroinitializer
Because of this, some of the lit tests contain more IR instructions.
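For reference, the source-level shape behind both IR forms can be sketched in
C++ as below. This is only an illustration: Vec2d and splatLoad are
hypothetical names that do not appear in the patch, and whether a compiler
actually lowers this to a single movddup depends on the target and the
optimization level.

```cpp
#include <array>

// Hypothetical 2-lane vector type, used only for this illustration.
using Vec2d = std::array<double, 2>;

// Hypothetical helper: one scalar load whose value is broadcast to both
// lanes. This mirrors the "load + 2 inserts" IR form: the load produces
// Ld, and each lane of the result is filled with that same value.
Vec2d splatLoad(const double *Ptr) {
  double Ld = *Ptr;  // the single scalar load
  return {Ld, Ld};   // lane 0 and lane 1 both receive the loaded value
}
```

Calling splatLoad(&X) for some double X yields a vector whose two lanes both
equal X, which is the pattern the new score is meant to recognize as cheap.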
Why do we need this? Why not just use getShuffleCost?