This adapts/copies code from the existing fold that allows widening of load scalar+insert. It can help in IR because it removes a shuffle, and the backend can already narrow loads if that is profitable in codegen.
We might be able to consolidate more of the logic, but handling this basic pattern should be enough to make a small difference on one of the motivating examples from issue #17113. The final goal of combining loads on those patterns is not solved though.
Note: I created this helper function to reduce code duplication. I didn't push that NFC change to main yet though (pending feedback here to do something different).