This is an archive of the discontinued LLVM Phabricator instance.

[x86] swap order of srl (and X, C1), C2 when it saves size
ClosedPublic

Authored by spatel on Sep 22 2017, 9:13 AM.

Details

Summary

The (non-)obvious win comes from saving 3 bytes by using the 0x83 'and' opcode variant instead of 0x81. There are also better improvements based on known-bits that allow us to eliminate the mask entirely.

As noted, this could be extended. There are potentially other wins from always shifting first, but doing that reveals a tangle of problems in other pattern matching. We do this transform generically in instcombine, but we often have icmp IR that doesn't match that pattern.

Diff Detail

Repository
rL LLVM

Event Timeline

spatel created this revision.Sep 22 2017, 9:13 AM
craig.topper added inline comments.Sep 22 2017, 9:45 AM
lib/Target/X86/X86ISelLowering.cpp
31785 ↗(On Diff #116354)

What about a larger than 32-bit and mask that would allow us to use a 32-bit and? Otherwise we use a movabsq to load the immediate.

31837 ↗(On Diff #116354)

Not related to this patch, but shouldn't that be "Arithmetic" not "Algebraic"?

test/CodeGen/X86/urem-i8-constant.ll
10 ↗(On Diff #116354)

It's not immediately obvious to me how moving 0x7000 right by 12 bits turned into a mozwl.

craig.topper added inline comments.Sep 22 2017, 10:16 AM
test/CodeGen/X86/urem-i8-constant.ll
10 ↗(On Diff #116354)

Oh there's magic in SelectionDAGISel::CheckAndMask that I never knew about.

spatel added inline comments.Sep 22 2017, 10:35 AM
lib/Target/X86/X86ISelLowering.cpp
31785 ↗(On Diff #116354)

Yes, that's a limitation. I'll have to check if that causes regressions for the other patterns. Ok to make that a TODO in this patch?

31837 ↗(On Diff #116354)

'Algebraic' is the IBM / Power lingo:
https://www.ibm.com/support/knowledgecenter/en/ssw_aix_71/com.ibm.aix.alangref/idalangref_srawi_srai_instrs.htm
...which is probably why I chose it. I can make it 'Arithmetic' to be more x86.

test/CodeGen/X86/urem-i8-constant.ll
10 ↗(On Diff #116354)

Hmm...right, there's a very late computeKnownBits that I didn't see either. I did write a dirty test program to confirm that it's not miscompiling for any 8-bit urem (still waiting for Alive to come back).

craig.topper accepted this revision.Sep 22 2017, 10:46 AM

LGTM

lib/Target/X86/X86ISelLowering.cpp
31785 ↗(On Diff #116354)

TODO is fine.

This revision is now accepted and ready to land.Sep 22 2017, 10:46 AM
spatel added inline comments.Sep 23 2017, 7:36 AM
lib/Target/X86/X86ISelLowering.cpp
31785 ↗(On Diff #116354)
This revision was automatically updated to reflect the committed changes.