Depends on D43813. Depends on D41278. We'd like to be able to select SHLD/SHRD instructions or alternative code sequences depending on the target CPU. This patch shows one possible way to do that. It supports only one instruction, SHLD(16/32/64)rri8, but it demonstrates the approach. If the suggested approach is acceptable, I'll continue with extended patches.
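To make the idea concrete, here is a minimal standalone sketch, not code from this patch. `shld32` models the 32-bit SHLD with an immediate count, and `shiftOr32` is the equivalent shift/shift/or sequence. `chooseShldLowering` and its latency parameters are hypothetical names invented for illustration; they only mimic the kind of critical-path comparison that would drive the choice, under the assumption that the two shifts can issue in parallel.

```cpp
#include <cassert>
#include <cstdint>

// Semantics of a 32-bit SHLD with an immediate count in 0..31: shift Dst left
// by Amt and fill the vacated low bits from the high bits of Src.
inline uint32_t shld32(uint32_t Dst, uint32_t Src, unsigned Amt) {
  assert(Amt < 32 && "count must be below the operand width");
  if (Amt == 0)
    return Dst; // A zero count leaves the destination unchanged.
  return (Dst << Amt) | (Src >> (32 - Amt));
}

// The alternative sequence: two independent shifts followed by an OR.
inline uint32_t shiftOr32(uint32_t Dst, uint32_t Src, unsigned Amt) {
  assert(Amt < 32 && "count must be below the operand width");
  if (Amt == 0)
    return Dst;                      // Avoids an undefined shift by 32.
  uint32_t Hi = Dst << Amt;          // SHL
  uint32_t Lo = Src >> (32 - Amt);   // SHR
  return Hi | Lo;                    // OR
}

// Hypothetical selection helper (illustration only, not an LLVM API).
enum class ShldLowering { UseSHLD, UseShiftOr };

// Assumption: the two shifts run in parallel, so the critical path of the
// alternative sequence is one shift plus the OR.
inline ShldLowering chooseShldLowering(unsigned ShldLatency,
                                       unsigned ShiftLatency,
                                       unsigned OrLatency) {
  unsigned AltLatency = ShiftLatency + OrLatency;
  return AltLatency < ShldLatency ? ShldLowering::UseShiftOr
                                  : ShldLowering::UseSHLD;
}
```

The latency values passed to `chooseShldLowering` would have to come from the target's scheduling model; none are assumed here.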
I tried to split the patch into 2 steps: the debug extension and the SHLD support itself. Unfortunately, the only use of the Machine Combiner I could find is the mul-lohi.ll test, and that test is not informative enough to show the improvements in the debug output. On the other hand, the X86 target code does not have any alternative MC code patterns that would show the changes in the existing MC DEBUG() expressions. As a result, I created this rather big patch that includes all the changes at once.
Most probably I'll create a new extended patch once (and if) we accept this one as the initial way to resolve the issue.
Shouldn't a case like this be handled by computeOperandLatency?