This is an archive of the discontinued LLVM Phabricator instance.

[X86] Optimize (and (srl X 30) 2)
Needs Review · Public

Authored by kazu on Mar 6 2023, 9:49 PM.

Details

Summary

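This patch transforms (and (srl X 30) 2) into (add Y Y), where
Y = (srl X 31). After the shift by 30 only bits 0 and 1 can be set, so
the and with 0xfffffffe shown below is equivalent to an and with 2.
In x86 assembly: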

c1 e8 1e                   shr    $0x1e,%eax
83 e0 fe                   and    $0xfffffffe,%eax

is transformed into:

c1 e8 1f                   shr    $0x1f,%eax
01 c0                      add    %eax,%eax

When the source and destination operands are different, we can emit
lea:

c1 ef 1f                   shr    $0x1f,%edi
8d 04 3f                   lea    (%rdi,%rdi,1),%eax

eliminating the need for a mov instruction.

This patch fixes:

https://github.com/llvm/llvm-project/issues/61073

Event Timeline

kazu created this revision. Mar 6 2023, 9:49 PM
Herald added a project: Restricted Project. Mar 6 2023, 9:49 PM
kazu requested review of this revision. Mar 6 2023, 9:49 PM

I'm worried that this is very specific code for something I think we might be able to do with more generic folds (and possibly an X86ISelDAGToDAG tweak).

llvm/test/CodeGen/X86/and-shift.ll (line 56)

We might be able to get SimplifyDemandedBits to recognise this as shrl $30, %eax (with suitable TLI checks)

RKSimon added inline comments. Mar 7 2023, 3:10 PM
llvm/test/CodeGen/X86/and-shift.ll (line 56)

DAGCombiner::visitANDLike should already handle this (it might be failing to peek through a truncation)

RKSimon added a comment. Edited Jul 18 2023, 2:10 AM

@kazu Rebase this now that D146121 has landed?