These instructions are currently marked as "as cheap as a move".
According to publicly available Software Optimization Guides, however,
they have single-cycle latency and maximum throughput only on some
microarchitectures, and then only for LSL with certain shift amounts.
This patch uses the subtarget feature FeatureLSLFast to determine
how cheap the instructions are. As a consequence, each subtarget
with FeatureLSLFast now also gets FeatureCustomCheapAsMoveHandling.
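
The decision described above can be sketched as a small standalone
model. This is only an illustration, not the code in this patch:
ToySubtarget, ToyShiftedOp, ShiftKind and the MaxFastShift limit are
hypothetical names, and the actual cut-off for shift amounts is
whatever each core's optimization guide states.

```cpp
// Toy model of a feature-gated "as cheap as a move" check.
// All types here are hypothetical stand-ins, not LLVM classes.

enum class ShiftKind { LSL, LSR, ASR, ROR };

struct ToySubtarget {
  bool HasLSLFast;       // models the FeatureLSLFast subtarget feature
  unsigned MaxFastShift; // ASSUMPTION: largest LSL amount that is still fast
};

struct ToyShiftedOp {
  ShiftKind Kind;  // shift applied to the register operand
  unsigned Amount; // shift amount in bits
};

// Returns true only when the subtarget has fast LSL, the shift is an
// LSL, and the amount is within the assumed per-core limit.
inline bool isAsCheapAsAMove(const ToySubtarget &ST, const ToyShiftedOp &Op) {
  if (!ST.HasLSLFast)
    return false; // without the feature, never treat the op as cheap
  if (Op.Kind != ShiftKind::LSL)
    return false; // only LSL qualifies
  return Op.Amount <= ST.MaxFastShift;
}
```

In this model, a subtarget without the feature never reports a shifted
operation as cheap, and one with the feature does so only for small
LSL amounts.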