I was looking at a potential DAGCombiner fix for 1 of the regressions in D60278, and it caused severe regression test pain because x86 TLI lies about the desirability of 8-bit shift ops.
We've hinted at making all 8-bit ops undesirable for the reason in the code comment:
// TODO: Almost no 8-bit ops are desirable because they have no actual // size/speed advantages vs. 32-bit ops, but they do have a major // potential disadvantage by causing partial register stalls.
...but that leads to massive diffs and exposes all kinds of optimization holes itself. So I'm trying for 1 tiny opcode step at a time. :)