If we know that the two halves of an oversized zext-in-reg are the same, don't create those halves independently.
I tried several different approaches to fold this, but it's difficult to get right during legalization. In the default path, we are creating a generic shuffle that looks like an unpack high, but it can get transformed into a different mask (a blend), so it's not straightforward to match that. If we try to fold after it actually becomes an X86ISD::UNPCKH node, we can't be sure what the operand node is - it might be a generic shuffle, or it could be some x86-specific op.
I thought we had a utility to determine whether a mask has a splat-subset pattern of arbitrary size, but I don't see one, so I wrote a small mask-matching helper for this one case.
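For illustration only, a check of this general kind could look like the standalone sketch below. It is not the helper from this patch: the name isRepeatedHalfMask and the use of std::vector<int> are assumptions made for the example, and it handles only the "upper half repeats lower half" case. As in LLVM shuffle masks, -1 is treated as an undef lane that matches anything.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the one added by the patch): returns true if the
// upper half of a shuffle mask selects the same elements as the lower half.
// A lane of -1 is undef and is allowed to match any index.
bool isRepeatedHalfMask(const std::vector<int> &Mask) {
  std::size_t Size = Mask.size();
  // An empty or odd-sized mask cannot be split into two equal halves.
  if (Size == 0 || Size % 2 != 0)
    return false;

  std::size_t Half = Size / 2;
  for (std::size_t I = 0; I != Half; ++I) {
    int Lo = Mask[I];
    int Hi = Mask[I + Half];
    assert(Lo >= -1 && Hi >= -1 && "invalid shuffle mask element");
    // Undef in either half places no constraint on the other half.
    if (Lo < 0 || Hi < 0)
      continue;
    if (Lo != Hi)
      return false;
  }
  return true;
}
```

With this sketch, a mask such as {0, 1, 0, 1} matches, while {0, 1, 2, 3} does not. Letting undef lanes match is a design choice for the example; a stricter matcher could reject them.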
From the test output, we should be doing something like this for SSE4.1 as well, but I'd rather leave that as a follow-up since it involves changing lowering actions.