While there is still some discreteness within that new group,
it is clearly separate from the other shifts.
This is a purely mechanical change; it does not change any numbers,
as the unchanged mca tests show.
I'm guessing FeatureSlowSHLD is related.
Agner's tables agree: these double shifts are clearly different
from the normal shifts/rotates.
I'm not sure a basic sched pair is the best match for double shifts. At least on AMD targets, the imm/reg versions and the RMW versions all behave quite differently; see X86ScheduleBtVer2.td.
I'd be a lot happier if we had better perf numbers for all of these (we did some for Jaguar but nothing else; we need some decent Intel tests for starters). Then we can decide how to create the classes to match.
@lebedev.ri What do you think?