We can consolidate code and clarify edge case behavior at the same time.
There are two functional differences here.
First, I remove the ResVT handling, and always use the reduction element type. This appears to be dead code. There's no test coverage, and I think such a construct wouldn't be legal anyways. This is the main reason I posted this for review as I want someone to double check me on this point.
Second, if the VL happens to be known non-zero, we can avoid passing through start. This is mostly needed to allow reuse of the existing code; I don't consider it interesting as an optimization on it's own.