This is an archive of the discontinued LLVM Phabricator instance.

[X86]: Quit promoting 8 and 16 bit compares to 32 bit.
ClosedPublic

Authored by kbsmith1 on Jun 8 2016, 10:48 AM.

Details

Summary

This change effectively reverts r195496 and updates the tests as needed.
8 and 16 bit compares are no longer promoted to 32 bit compares. This gives some
nice performance improvements, especially in eembc/rgbcmykv2. To avoid
performance regressions in 401.bzip2, the changes in http://reviews.llvm.org/D21085
are also necessary, so that all the necessary movb and movw instructions get promoted to
movzbl/movzwl.
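
As a rough illustration of the kind of code affected (a sketch, not taken from the patch): the functions below perform 8 and 16 bit compares at the source level. The function names are hypothetical, and the assembly in the comments only illustrates the two lowering strategies; it is not actual compiler output. Both strategies must give the same answer, because zero-extending both operands preserves equality and unsigned ordering.

```cpp
#include <cstdint>

// Hypothetical example: a 16 bit equality compare of two loaded values.
// Narrow lowering (illustrative only):
//   movw   (%rdi), %ax
//   cmpw   (%rsi), %ax
// Promoted lowering (illustrative only):
//   movzwl (%rdi), %eax
//   movzwl (%rsi), %ecx
//   cmpl   %ecx, %eax
bool equal16(const std::uint16_t *a, const std::uint16_t *b) {
    return *a == *b;
}

// The same compare with the zero-extension to 32 bits written out by hand.
bool equal16Promoted(const std::uint16_t *a, const std::uint16_t *b) {
    std::uint32_t wideA = *a;  // zero-extends, like movzwl
    std::uint32_t wideB = *b;
    return wideA == wideB;
}

// Hypothetical example: an 8 bit unsigned ordering compare.
bool less8(const std::uint8_t *a, const std::uint8_t *b) {
    return *a < *b;
}
```

Whichever form the backend picks, the observable result is identical; the review is only about which encoding runs faster.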

Event Timeline

kbsmith1 updated this revision to Diff 60068. Jun 8 2016, 10:48 AM
kbsmith1 retitled this revision to [X86]: Quit promoting 8 and 16 bit compares to 32 bit.
kbsmith1 updated this object.
kbsmith1 added a subscriber: llvm-commits.

We talked about this on the list, but getting an explicit ack from Jim.

-eric

echristo accepted this revision. Jun 8 2016, 1:27 PM
echristo edited edge metadata.
This revision is now accepted and ready to land. Jun 8 2016, 1:27 PM
eli.friedman added inline comments.
test/CodeGen/X86/memcmp.ll
44

16-bit immediate operands are bad for performance on modern x86.

kbsmith1 added inline comments. Jun 8 2016, 3:48 PM
test/CodeGen/X86/memcmp.ll
44

I'm looking into changing the code so that 16 bit compares which have a constant operand will continue to get promoted, and how that affects the performance numbers.

kbsmith1 updated this revision to Diff 60244. Jun 9 2016, 2:51 PM
kbsmith1 edited edge metadata.

Updated changes so this will continue to promote 16 bit compares to 32 bits if one of the compare
operands is a constant. This addresses Eli Friedman's comment.
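
The policy described in this update can be sketched as a small predicate. This is a hypothetical helper written for illustration, not the actual LLVM code; the name `shouldPromoteCompareTo32` and its parameters are assumptions.

```cpp
// Hypothetical model of the policy described in this revision.
// Not the actual LLVM implementation.
bool shouldPromoteCompareTo32(unsigned bitWidth, bool hasConstantOperand) {
    // 8 bit compares stay narrow.
    // 16 bit compares stay narrow unless one operand is a constant, in
    // which case promotion avoids emitting a 16 bit immediate.
    // Compares that are already 32 bits or wider are not affected.
    return bitWidth == 16 && hasConstantOperand;
}
```

Under this model, a 16 bit compare of two loaded values stays a 16 bit compare, while a 16 bit compare against a constant is still widened to 32 bits.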

This revision was automatically updated to reflect the committed changes.

Updated changes so this will continue to promote 16 bit compares to 32 bits if one of the compare
operands is a constant. This addresses Eli Friedman's comment.

Yes, a 16-bit immediate constant may introduce a length-changing prefix (LCP) stall that can end up hurting performance. But on newer architectures (Sandy Bridge and later), this problem is much less severe, especially when the loop body fits in the loop stream detector (LSD). I don't think it's a good idea to blindly convert all 16-bit immediate constant comparisons to 32-bit.
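
For readers unfamiliar with the LCP issue, here is a minimal sketch of why a 16-bit immediate is special. The byte sequences are the standard encodings of the two instructions in 32-bit or 64-bit mode; the helper `cmpAccImmLength` is hypothetical and handles only this one opcode. The 0x66 operand-size prefix changes the length of the immediate that follows the opcode, so the instruction length cannot be determined from the opcode alone.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// cmpw $0x1234, %ax : prefix 0x66, opcode 0x3D, 2-byte immediate.
constexpr std::array<std::uint8_t, 4> kCmp16 = {0x66, 0x3D, 0x34, 0x12};

// cmpl $0x1234, %eax : opcode 0x3D, 4-byte immediate.
constexpr std::array<std::uint8_t, 5> kCmp32 = {0x3D, 0x34, 0x12, 0x00, 0x00};

// Hypothetical helper: length in bytes of a "CMP accumulator, immediate"
// instruction (opcode 0x3D), with or without a leading 0x66 prefix.
// It does not handle any other opcode or prefix.
std::size_t cmpAccImmLength(const std::uint8_t *bytes) {
    const bool hasOperandSizePrefix = (bytes[0] == 0x66);
    const std::size_t prefixBytes = hasOperandSizePrefix ? 1 : 0;
    const std::size_t opcodeBytes = 1;
    // The prefix shrinks the immediate from 4 bytes to 2: this is the
    // "length-changing" part of a length-changing prefix.
    const std::size_t immediateBytes = hasOperandSizePrefix ? 2 : 4;
    return prefixBytes + opcodeBytes + immediateBytes;
}
```

Promoting the compare to 32 bits swaps the first encoding for the second, which is one byte longer but carries no length-changing prefix.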