Rotate by 1 is translated to 1 micro-op, while rotate with imm8 is translated to 2 micro-ops.
Fixes pr30644.
Paths
| Differential D25399
[X86] Prefer rotate by 1 over rotate by imm ClosedPublic Authored by zvi on Oct 8 2016, 12:08 PM.
Details Summary Rotate by 1 is translated to 1 micro-op, while rotate with imm8 is translated to 2 micro-ops. Fixes pr30644.
Diff Detail
Event Timelinezvi updated this object.
Comment Actions Please can you confirm that the _lrotl/_lrotr intrinsics (I think they are only on clang-cl) can't create the removed rotr #1 pattern Comment Actions
Seems like for 64-bit operand intrinsics we are not matching any rotate instructions: unsigned long ltestl(unsigned long _Value) { return _lrotl(_Value, 1); } leaq (%rdi,%rdi), %rax shrq $31, %rdi orq %rdi, %rax retq unsigned long ltestr(unsigned long _Value) { return _lrotr(_Value, 1); } movq %rdi, %rax shrq %rax shlq $31, %rdi orq %rax, %rdi movq %rdi, %rax retq But the 32-bit operand flavors are matching rotate instructions (these are withour this patch): unsigned long testl(unsigned long _Value) { return _rotl(_Value, 1); } roll %edi movq %rdi, %rax retq unsigned long testr(unsigned long _Value) { return _rotr(_Value, 1); } roll $31, %edi movq %rdi, %rax retq The 64-bit operand issue seems like out of scope of this patch, since the matching to ISD::ROTR/ROTL nodes happens in the target-independent phase of DagCombine. I will open a Bugzilla for this and will investigate it further. Comment Actions Actually, the relevant 64-bit operand intrinsics are _rotl64 and _rotr64 for which the generated code without this patch is: test_rotl64(unsigned long): # @test_rotl64(unsigned long) rolq %rdi movq %rdi, %rax retq test_rotr64(unsigned long): # @test_rotr64(unsigned long) rolq $63, %rdi movq %rdi, %rax retq This revision is now accepted and ready to land.Oct 10 2016, 7:44 AM
Revision Contents
Diff 74071 lib/Target/X86/X86InstrShiftRotate.td
test/CodeGen/X86/rotate.ll
|
Does rotr by 1 never get emitted or should we really have two patterns here?