When vectorising certain loops with scalable vectors, we attempt to
use the replicate recipe even for non-uniform cases. The recipe does
not handle scalable vectors correctly because it tries to scalarise
the vector instruction by generating a scalar instance for each lane
and packing them into a vector. With scalable vectors the number of
lanes is not known at compile time, so this is not an option.
Instead, I've created a new scalable replicate recipe called
VPScalableReplicate, which calls a new overload of
scalarizeInstruction that generates a whole vector part in one go,
rather than N scalar instances for N lanes. The new overload is based
on the original version and currently only supports certain cases,
such as GEP instructions or instructions with loop-invariant
operands.
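
The difference between the two strategies can be illustrated with a
small standalone model. This is only a sketch and does not use any LLVM
API: ToyVector, scalarisePerLane and generateWholePart are hypothetical
names invented for illustration, and the GEP is modelled as plain
"base + lane * stride" address arithmetic. The per-lane version needs
the lane count as a compile-time constant, whereas the whole-part
version is written purely in terms of vector-wide operations, so it
works when the lane count (minLanes * vscale) is only known at run time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only: a "vector value" is one int64_t per lane.
using ToyVector = std::vector<std::int64_t>;

// Per-lane scalarisation: emit one scalar instance per lane and pack the
// results. NumLanes has to be a compile-time constant, which is exactly
// what is missing for scalable vectors.
template <std::size_t NumLanes>
ToyVector scalarisePerLane(std::int64_t base, std::int64_t stride) {
  ToyVector result(NumLanes);
  for (std::size_t lane = 0; lane < NumLanes; ++lane)
    result[lane] = base + static_cast<std::int64_t>(lane) * stride;
  return result;
}

// Vector-wide building blocks. Each stands in for a single vector
// operation; the loops inside only model the hardware lanes.
ToyVector splat(std::size_t lanes, std::int64_t value) {
  return ToyVector(lanes, value);
}

ToyVector stepVector(std::size_t lanes) {
  ToyVector result(lanes);
  for (std::size_t i = 0; i < lanes; ++i)
    result[i] = static_cast<std::int64_t>(i); // <0, 1, 2, ...>
  return result;
}

ToyVector mul(const ToyVector &a, const ToyVector &b) {
  assert(a.size() == b.size());
  ToyVector result(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    result[i] = a[i] * b[i];
  return result;
}

ToyVector add(const ToyVector &a, const ToyVector &b) {
  assert(a.size() == b.size());
  ToyVector result(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    result[i] = a[i] + b[i];
  return result;
}

// Whole-part generation: splat(base) + stepvector * splat(stride).
// The lane count is minLanes * vscale, where vscale is a run-time value.
ToyVector generateWholePart(std::int64_t base, std::int64_t stride,
                            std::size_t minLanes, std::size_t vscale) {
  assert(minLanes > 0 && vscale > 0);
  const std::size_t lanes = minLanes * vscale;
  return add(splat(lanes, base), mul(stepVector(lanes), splat(lanes, stride)));
}
```

For a given lane count both functions produce the same addresses, but
only generateWholePart can be called with a vscale that is chosen at
run time.
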
Tests have been added here: