It has been reported that the vector shift instructions tend to be worse than ADD/SUB on AArch64 cores.
This patch adds TableGen patterns for the following simple transformation:
x << 1 ==> x + x
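The equivalence behind this transformation can be checked exhaustively for a single lane. The following is a minimal sketch, not code from the patch: it assumes an 8-bit lane with wraparound arithmetic, and the helper names (`lane_shl1`, `lane_add_self`, `shl1_matches_add_for_all_u8`) are hypothetical.

```cpp
#include <cstdint>

// Hypothetical scalar model of one 8-bit vector lane; this is not LLVM code.
// Lane arithmetic wraps modulo 2^8, which is the assumption made here for
// both the vector shift and the vector add.
constexpr std::uint8_t lane_shl1(std::uint8_t x) {
    return static_cast<std::uint8_t>(static_cast<unsigned>(x) << 1);
}

constexpr std::uint8_t lane_add_self(std::uint8_t x) {
    return static_cast<std::uint8_t>(static_cast<unsigned>(x) + x);
}

// Returns true if x << 1 == x + x for every possible 8-bit lane value.
constexpr bool shl1_matches_add_for_all_u8() {
    for (unsigned v = 0; v < 256; ++v) {
        const auto x = static_cast<std::uint8_t>(v);
        if (lane_shl1(x) != lane_add_self(x)) return false;
    }
    return true;
}

// Checked at compile time (requires C++14 or later for the constexpr loop).
static_assert(shl1_matches_add_for_all_u8(),
              "x << 1 must equal x + x in every 8-bit lane");
```

The same argument applies to wider lanes, since both operations compute 2x modulo the lane width; the 8-bit case is used only because it can be enumerated cheaply.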
Differential D153049
[AArch64] Try to convert vector shift operation into vector add operation
Closed, Public. Authored by jaykang10 on Jun 15 2023, 9:58 AM.
Event Timeline
LGTM
I'm a little concerned this could run into issues along the lines of https://github.com/llvm/llvm-project/issues/49812 ... but I guess we currently don't try to model freeze post-isel, so it's not too likely you'll run into issues.
This revision is now accepted and ready to land. Jun 15 2023, 11:47 AM
Oh, also, please verify we have testcases to make sure these patterns don't interfere with the formation of sshll/ushll.
Thanks for the kind comments, @efriedma.
As far as I understand, the AArch64 target uses the default SelectionDAGISel code for ISD::FREEZE, so ISD::FREEZE is mapped to TargetOpcode::COPY and ISD::UNDEF is mapped to TargetOpcode::IMPLICIT_DEF. The ProcessImplicitDefs pass removes the IMPLICIT_DEF and marks the add's operands as undef. The register allocator assigns the same register to both of the add's operands because they use the same virtual register. Therefore, I think both x << 1 and x + x are guaranteed to produce an even result.
Yep, let me check the test cases more.
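The evenness argument above can be illustrated with a small scalar model. This is only a sketch under stated assumptions, not a model of LLVM's undef semantics or of the register allocator: `add_lanes` and the other helper names are hypothetical, and "independent reads" stands in for the case where the two operands of the add could observe different values.

```cpp
#include <cstdint>

// Hypothetical model of the concern discussed above; this is not LLVM code.
// add_lanes models an 8-bit ADD whose two operands are read separately.
constexpr std::uint8_t add_lanes(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(static_cast<unsigned>(a) + b);
}

constexpr bool is_even(std::uint8_t v) { return (v & 1u) == 0; }

// Both operands come from the same register: whatever value it holds,
// the sum x + x is even, matching what x << 1 would produce.
constexpr bool same_register_always_even() {
    for (unsigned v = 0; v < 256; ++v) {
        const auto x = static_cast<std::uint8_t>(v);
        if (!is_even(add_lanes(x, x))) return false;
    }
    return true;
}

// If the two reads could observe different values (here 2 and 3, chosen
// arbitrarily), the sum can be odd, which a shift by one never is.
constexpr bool independent_reads_can_be_odd() {
    return !is_even(add_lanes(2, 3));
}
```

In this model the guarantee depends entirely on both operands naming the same register, which is the property the reply attributes to the shared virtual register.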
Closed by commit rG82d330e0e04a: [AArch64] Try to convert vector shift operation into vector add operation (authored by jaykang10). Jun 16 2023, 9:15 AM
This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 532191
llvm/lib/Target/AArch64/AArch64InstrInfo.td
llvm/test/CodeGen/AArch64/arm64-sli-sri-opt.ll
llvm/test/CodeGen/AArch64/arm64-vshift.ll
llvm/test/CodeGen/AArch64/rax1.ll
llvm/test/CodeGen/AArch64/shl-to-add.ll
llvm/test/CodeGen/AArch64/urem-seteq-illegal-types.ll
llvm/test/CodeGen/AArch64/vector_splat-const-shift-of-constmasked.ll
AArch64vshl can be used directly in the Pat, if it is always the same,