This fixes PR23155 by turning on the X86FixupBWInst optimization by default,
and updates the few tests affected by the change. The X86FixupBWInst optimization
already has unit tests, which were checked in with its initial commit.
LGTM.
Have you measured any other perf diffs using this pass besides the case in PR23155?
Yes, I have run perf measurements locally on a number of different kinds of boxes. As stated, this generates exactly the code desired
for PR23155 (modulo the alignment differences noted in that PR as having some performance effect).
Some highlights:
+8% on EEMBC/viterb (this was what motivated PR23155)
+3% on spec2000 parser
+2% on spec2000 twolf
+3% on EEMBC/ip_reassemblyIT, tcpbulk, tcpmixed
+2% on EEMBC/qos, natIT
A couple of minor regressions as well:
-3% on EEMBC/puwmod01
-2% on coremark/zip-test
-2% on spec2006/libquantum at -O2, but +1% at -O2 with Haswell-specific CPU tuning flags.
So, generally this doesn't make a big difference on benchmarks, but is more positive than negative.