[InstCombine] enhance shuffle-of-binops to allow different variable ops (PR37806)
If we have 2 different variable values, then we shuffle (select) those lanes,
shuffle (select) the constants, and then perform the binop. This eliminates a binop.
The new shuffle uses the same shuffle mask as the existing shuffle, so there's no
danger of creating a difficult shuffle.
All of the earlier constraints still apply, but we also check for extra uses to
avoid creating more instructions than we'll remove.
Additionally, we're disallowing the fold for div/rem because that could expose a
Differential Revision: https://reviews.llvm.org/D48678