We have VSX store instructions that will store a single element from a vector without modifying it in any way. Previous generation cores can do this for word-sized elements and Power9 can also do it for half-word and byte-sized elements.
The TableGen patterns for the word-sized versions were missing - this patch adds them.
Furthermore, it provides cost information for such a combine: zero when the index is the element that the instruction stores, and 3 for any other element. The nonzero cost is there because a vector permute is needed to shift the element into position, so the combine is only worthwhile if keeping the value as a vector reduces the cost enough to offset the permute.
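To make the cost rule concrete, here is a minimal sketch of the logic described above. It is not the code from the patch: the function names, the `storedElt` parameter (the index of the element the store instruction writes directly, which this sketch simply takes as an input), and the `savings` comparison are all hypothetical and only illustrate the reasoning.

```cpp
// Hypothetical illustration only; names and signatures are not from the patch.

// Cost of the vector permute needed to move an element into the
// position that the store instruction writes (value taken from the summary).
constexpr int kPermuteCost = 3;

// Cost of extracting element `index` when the only user is a store.
// `storedElt` is assumed to be supplied by the caller.
int extractForStoreCost(unsigned index, unsigned storedElt) {
  // No permute is needed when the element is already in place.
  return index == storedElt ? 0 : kPermuteCost;
}

// Assumption: the combine pays off when the savings from keeping the
// value as a vector at least cover the cost of the permute.
bool worthKeepingAsVector(unsigned index, unsigned storedElt, int savings) {
  return savings >= extractForStoreCost(index, storedElt);
}
```

With these assumptions, an extract of the directly stored element is always free, while any other element has to save at least 3 elsewhere before the combine is profitable.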
I didn't take a deep look at the implementation for this patch, but the condition here does not seem to quite align with the comments. If the bit width is 32, we will combine the store and the extract regardless of whether the target is Power9. I am not sure if this is intentional.