ptrue patterns other than VL1.
The current algorithm identifies svdup intrinsic calls where the
predicate is a ptrue with the pattern VL1. It is possible to also
perform this optimisation for ptrue patterns in the range VL1 to VL8
inclusive.
Suppose we have an svdup intrinsic call:
svdup (vec, (ptrue pat), elm)
If pat is VL1, we perform the optimisation as before. Else, if pat is in
the range VL2 to VL8 inclusive, we check that the only uses of this
svdup are other svdups that overwrite every vector element except for
one -- if so, we can replace the svdup call with an insertelement
instruction. In any other case, we bail out.
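To make the case split above concrete, here is a minimal, self-contained sketch of the decision, written as a toy model rather than against the LLVM API. Everything in it is hypothetical: `PTruePattern`, `SvdupUser`, `activeLanes` and `insertLaneFor` are illustrative names, not types or functions from the patch. It also assumes one particular reading of "overwrite every vector element except for one": a VLn svdup writes lanes 0 to n-1, and each user is an svdup with pattern VL(n-1) that takes this result as its passthrough vector, so only lane n-1 survives.

```cpp
#include <optional>
#include <vector>

// Hypothetical stand-in for the SVE ptrue patterns (not the LLVM enum).
// VL1..VL8 activate the first 1..8 lanes; Other covers everything else.
enum class PTruePattern { VL1 = 1, VL2, VL3, VL4, VL5, VL6, VL7, VL8, Other };

// Number of leading lanes a VLn pattern activates, or nullopt for Other.
std::optional<unsigned> activeLanes(PTruePattern Pat) {
  if (Pat == PTruePattern::Other)
    return std::nullopt;
  return static_cast<unsigned>(Pat);
}

// Toy model of one user of the svdup result. Assumption: the user is itself
// svdup(<this result>, ptrue Pat, elm'). A user that is not an svdup can be
// modelled with Pat == Other, which forces a bail-out.
struct SvdupUser {
  PTruePattern Pat;
};

// Returns the lane index for the replacement insertelement, or nullopt if
// the optimisation should bail out.
std::optional<unsigned> insertLaneFor(PTruePattern Pat,
                                      const std::vector<SvdupUser> &Users) {
  std::optional<unsigned> N = activeLanes(Pat);
  if (!N)
    return std::nullopt; // Not VL1..VL8: bail out.
  if (*N == 1)
    return 0u;           // VL1: only lane 0 is written, as before.

  // VL2..VL8: every user must overwrite lanes 0..N-2, leaving lane N-1.
  // With no users there is nothing to justify the rewrite, so this sketch
  // bails out (an assumption; the source does not cover this case).
  if (Users.empty())
    return std::nullopt;
  for (const SvdupUser &U : Users) {
    std::optional<unsigned> M = activeLanes(U.Pat);
    if (!M || *M != *N - 1)
      return std::nullopt;
  }
  return *N - 1;
}
```

Under these assumptions, a VL2 svdup whose only user is a VL1 svdup maps to an insertelement at lane 1, while a VL3 svdup whose user is a VL1 svdup bails out because lanes 1 and 2 both survive.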
Instead of checking for this specific pattern, can we leverage simplifyDemandedVectorEltsIntrinsic? I guess that doesn't currently work with scalable vectors, though...