Currently, when encountering store(trunc(..)) where the trunc operates on a type double the width of a legal MVE vector, we split the node into two stores, each performing half of the trunc from the wider type. This works well for efficiently lowering wider-than-legal types (otherwise the trunc becomes a series of individual lane moves). Unfortunately this splitting is currently one of the first combines attempted, so it can happen before any other combines that might be preferable.
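To illustrate why the split is semantically safe, here is a minimal scalar sketch, not the actual combine. It assumes an 8 x i32 source (twice the 128-bit legal width) truncated to 8 x i16. The names `storeTruncWhole`, `narrowingStoreHalf` and `storeTruncSplit` are hypothetical and exist only for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model only (assumption): the wide source is 8 x i32, i.e. double
// the width of a legal 4 x i32 vector.
using WideVec = std::array<uint32_t, 8>;

// Reference behaviour: truncate all 8 lanes to i16 and store them contiguously.
void storeTruncWhole(const WideVec &src, uint16_t *dst) {
  for (std::size_t i = 0; i < src.size(); ++i)
    dst[i] = static_cast<uint16_t>(src[i]);
}

// Models one narrowing store: 4 x i32 lanes written out as 4 x i16.
void narrowingStoreHalf(const uint32_t *lanes, uint16_t *dst) {
  for (std::size_t i = 0; i < 4; ++i)
    dst[i] = static_cast<uint16_t>(lanes[i]);
}

// Split form: two narrowing stores, the second offset by 4 elements.
void storeTruncSplit(const WideVec &src, uint16_t *dst) {
  narrowingStoreHalf(src.data(), dst);
  narrowingStoreHalf(src.data() + 4, dst + 4);
}

// Checks that both forms write the same 8 x i16 values for a given input.
void checkSplitMatchesWhole(const WideVec &src) {
  uint16_t whole[8] = {};
  uint16_t split[8] = {};
  storeTruncWhole(src, whole);
  storeTruncSplit(src, split);
  for (std::size_t i = 0; i < 8; ++i)
    assert(whole[i] == split[i]);
}
```

Calling `checkSplitMatchesWhole` on any input exercises both paths. The point of the model is only that each half can be truncated and stored independently, which is what makes the two-store form a valid replacement for the lane-by-lane expansion.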
This patch instead introduces an MVETRUNC ISel node that the trunc is initially lowered to, keeping it intact as a single item rather than splitting it up. This allows us to push the store(trunc(..)) combine later, so other optimisations can potentially happen on the trunc first. The store(trunc(..)) splitting can then be done later in the legalisation period if needed, or else fall back to a buildvector as before.
This can also be used in the future to lower to loads/stores, as opposed to the more expensive lane extracts/inserts.
Some extra combines are added to keep all the existing tests happy. This is perhaps not the most elegant thing in the world, but it does help, as can be seen in some of the tests. It can also possibly be extended in the future to lower to a stack load/store, which may be more efficient than expanding, at the expense of a stack slot.
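As a rough picture of the possible stack lowering, the scalar sketch below compares a lane-by-lane expansion with a round trip through a stack slot. It assumes the node takes two legal 4 x i32 halves and produces one 8 x i16 result. That operand shape, and the names `truncByLaneMoves` and `truncViaStackSlot`, are assumptions for illustration rather than the patch's implementation.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

using Half = std::array<uint32_t, 4>;   // one legal 4 x i32 input (assumed)
using Narrow = std::array<uint16_t, 8>; // the legal 8 x i16 result (assumed)

// Expansion: eight individual lane moves (extract, truncate, insert).
Narrow truncByLaneMoves(const Half &lo, const Half &hi) {
  Narrow out{};
  for (std::size_t i = 0; i < 4; ++i) {
    out[i] = static_cast<uint16_t>(lo[i]);
    out[i + 4] = static_cast<uint16_t>(hi[i]);
  }
  return out;
}

// Stack round trip: two narrowing stores into a 16-byte slot, then a single
// full-width reload of the slot as 8 x i16.
Narrow truncViaStackSlot(const Half &lo, const Half &hi) {
  uint16_t slot[8]; // hypothetical stack slot
  for (std::size_t i = 0; i < 4; ++i)
    slot[i] = static_cast<uint16_t>(lo[i]);     // narrowing store of low half
  for (std::size_t i = 0; i < 4; ++i)
    slot[4 + i] = static_cast<uint16_t>(hi[i]); // narrowing store of high half
  Narrow out;
  std::memcpy(out.data(), slot, sizeof(slot));  // one full-vector load
  return out;
}
```

In this model the stack version costs three memory operations and one slot, against eight lane moves for the expansion. Whether that trade is a win on real hardware is not something the sketch can show.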
Could we delay the splitting of the truncate rather than coining a new node for it? I suppose that later stages in the pipeline may depend on it being legal, so the splitting would have to happen early.