If a vmv.s.x has an undef passthru and its scalar operand is a constant
that fits in a 5-bit signed immediate, we can use a vmv.v.i instead and
save an li. If the VL is also a constant > 0, we can set the VL to 1,
which exposes more opportunities to remove vsetvli toggles.
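
As a rough illustration of the precondition described above, here is a minimal standalone sketch. It is not the code from this patch: `ScalarMove`, `fitsSimm5`, `canUseVmvVI` and `rewrittenVL` are hypothetical names invented for the example, and the struct only models the three operands the combine cares about.

```cpp
#include <cstdint>

// Hypothetical stand-in for the vmv.s.x operands the combine inspects.
struct ScalarMove {
  bool passthruIsUndef;
  bool scalarIsConst;
  int64_t scalar;   // only meaningful if scalarIsConst
  bool vlIsConst;
  uint64_t vl;      // only meaningful if vlIsConst
};

// vmv.v.i takes a 5-bit signed immediate, i.e. a value in [-16, 15].
constexpr bool fitsSimm5(int64_t v) { return v >= -16 && v <= 15; }

// The move can become a vmv.v.i when the passthru is undef and the
// scalar is a constant that fits the immediate field.
constexpr bool canUseVmvVI(const ScalarMove &m) {
  return m.passthruIsUndef && m.scalarIsConst && fitsSimm5(m.scalar);
}

// A known non-zero constant VL can be replaced with 1; any other VL
// (zero or not a constant) is left as it was.
constexpr uint64_t rewrittenVL(const ScalarMove &m) {
  return (m.vlIsConst && m.vl > 0) ? 1 : m.vl;
}
```

For example, a move of the constant 7 with a constant VL of 4 would qualify and have its VL rewritten to 1, while a move of the constant 16 would still need the li.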
This patch adds a combine for this, and also teaches the vsetvli
insertion pass to treat a vmv.v.i with VL=1 like a vmv.s.x: since only
one element is written, we can expand the SEW to avoid a toggle. Without
the vsetvli change, the combine actually results in worse codegen
because of the additional toggle.
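
The SEW relaxation can be pictured with the simplified model below. This is only a sketch under stated assumptions, not the pass's actual data structures: `SEWDemand`, `MoveInfo`, `sewDemandFor` and `needsSEWToggle` are hypothetical names, and the model assumes the relaxation applies only when the passthru is undef and a single element is written.

```cpp
// Simplified, hypothetical model of a SEW demand check. It does not
// mirror the real structures in the vsetvli insertion pass.
enum class SEWDemand { Exact, GreaterThanOrEqual };

struct MoveInfo {
  bool passthruIsUndef;
  bool vlIsOne;   // the instruction writes only element 0
  unsigned sew;   // SEW the instruction was selected with
};

// Assumption for this sketch: a single-element write with an undef
// passthru only needs the current SEW to be at least as wide as its own.
constexpr SEWDemand sewDemandFor(const MoveInfo &m) {
  return (m.passthruIsUndef && m.vlIsOne) ? SEWDemand::GreaterThanOrEqual
                                          : SEWDemand::Exact;
}

// A toggle is needed only if the current SEW fails the demand.
constexpr bool needsSEWToggle(const MoveInfo &m, unsigned currentSEW) {
  return sewDemandFor(m) == SEWDemand::Exact ? currentSEW != m.sew
                                             : currentSEW < m.sew;
}
```

In this model, an e8 move with VL=1 that follows e32 code needs no toggle, whereas the same move with a larger VL would.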
We could just use the VSETVLIInfo in needVSETVLI to check for this in
the top-to-bottom pass, but then we miss out on the post-lowering
bottom-to-top pass, which catches cases where the vmv.v.i is the first
instruction in an entry block.
As noted in getDemanded, we shouldn't be looking at the VL operand as
it could be stale if the instruction is already lowered.
But I've convinced myself that checking the VL operand is actually ok here,
because even if it's stale, the demanded fields for a vmv.v.i with VL=1
should still be the same, regardless of what its actual lowered VL is.
(A sanity check here would be greatly appreciated!)
We may need to limit the combine to LMUL<=1 so that it doesn't increase
register pressure.
Similar patches: D139656 and its related patches.