Previously we just matched the logic ops and replaced with an
X86ISD::VPTERNLOG node that we would send through the normal
pattern match. But that approach couldn't handle a bitcast
between the logic ops. Extending that approach would require us
to peek through the bitcasts and emit new bitcasts to match
the types. Those new bitcasts would then have to be properly
topologically sorted.
This patch instead switches to directly emitting the
MachineSDNode and skips the normal tablegen pattern matching.
We do have to handle load folding and broadcast load folding
ourselves now. Which also means commuting the immediate control.