This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Try to convert vector shift operation into vector add operation
ClosedPublic

Authored by jaykang10 on Jun 15 2023, 9:58 AM.

Details

Summary

It has been reported that vector shift instructions tend to perform worse than ADD/SUB on AArch64 cores.
This patch adds TableGen patterns for the simple transformation below.

x << 1  ==>  x + x
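
As a quick sanity check of the identity above, the following is a minimal sketch that models vector lanes as 8-bit unsigned integers with wraparound. The lane width and the helper names `shl1` and `add_self` are assumptions made for illustration; they are not taken from the patch.

```python
# Illustrative model only: 8-bit lanes with wraparound, not LLVM code.
LANE_BITS = 8
LANE_MASK = (1 << LANE_BITS) - 1


def shl1(lanes):
    """Shift every lane left by one bit, discarding the carried-out bit."""
    return [(x << 1) & LANE_MASK for x in lanes]


def add_self(lanes):
    """Add every lane to itself, wrapping around at the lane width."""
    return [(x + x) & LANE_MASK for x in lanes]


# Exhaustive check over every possible 8-bit lane value.
every_lane_value = list(range(1 << LANE_BITS))
assert shl1(every_lane_value) == add_self(every_lane_value)
```

The two forms agree for every lane value because doubling and shifting left by one are the same operation modulo 2^n.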

Event Timeline

jaykang10 created this revision. Jun 15 2023, 9:58 AM
Herald added a project: Restricted Project. Jun 15 2023, 9:58 AM
jaykang10 requested review of this revision. Jun 15 2023, 9:58 AM
efriedma accepted this revision. Jun 15 2023, 11:47 AM

LGTM

I'm a little concerned this could run into issues along the lines of https://github.com/llvm/llvm-project/issues/49812 ... but I guess we currently don't try to model freeze post-isel, so it's not too likely you'll run into issues.

This revision is now accepted and ready to land. Jun 15 2023, 11:47 AM

Oh, also, please verify we have testcases to make sure these patterns don't interfere with the formation of sshll/ushll

dmgreen added inline comments. Jun 15 2023, 1:53 PM
llvm/lib/Target/AArch64/AArch64InstrInfo.td
7030

AArch64vshl can be used directly in the Pat if it is always the same.

Thanks for the kind comments, @efriedma.

I'm a little concerned this could run into issues along the lines of https://github.com/llvm/llvm-project/issues/49812 ... but I guess we currently don't try to model freeze post-isel, so it's not too likely you'll run into issues.

As far as I understand, the AArch64 target uses the default SelectionDAGISel code for ISD::FREEZE, so ISD::FREEZE is mapped to TargetOpcode::COPY and ISD::UNDEF is mapped to TargetOpcode::IMPLICIT_DEF. The ProcessImplicitDefs pass removes the IMPLICIT_DEF and marks the add's operands as undef. The register allocator assigns the same register to both of the add's operands because they use the same virtual register. Therefore, I think both x << 1 and x + x are guaranteed to produce an even result.
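
The evenness argument above can be illustrated with a small model. This is only a sketch under stated assumptions: lanes are 8 bits wide, an undefined input is modelled as "any value in range", and the function names are hypothetical. It does not model LLVM's actual passes; it only contrasts an add whose operands read the same register with a hypothetical add whose operands were resolved independently.

```python
import itertools

# Illustrative model only: an undefined 8-bit input may hold any value.
LANE_MASK = 0xFF
ANY_VALUE = range(256)


def shl1_results():
    """x << 1 reads the undefined input once, so the low bit is always 0."""
    return {(x << 1) & LANE_MASK for x in ANY_VALUE}


def add_same_register_results():
    """x + x where both operands read the same physical register."""
    return {(x + x) & LANE_MASK for x in ANY_VALUE}


def add_independent_results():
    """Hypothetical case: each operand resolves to an unrelated value."""
    pairs = itertools.product(ANY_VALUE, repeat=2)
    return {(a + b) & LANE_MASK for a, b in pairs}


def all_even(values):
    """True when every value in the collection has a zero low bit."""
    return all(v % 2 == 0 for v in values)
```

In this model `all_even(shl1_results())` and `all_even(add_same_register_results())` both hold, while `all_even(add_independent_results())` does not. That is why the argument depends on both operands ending up in the same register.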

Oh, also, please verify we have testcases to make sure these patterns don't interfere with the formation of sshll/ushll

Yep, let me check the test cases more.

llvm/lib/Target/AArch64/AArch64InstrInfo.td
7030

Yep, let me update it.

jaykang10 updated this revision to Diff 532072. Jun 16 2023, 3:17 AM

Added sshll/ushll test cases which should not be affected by this patch.
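
For context on why the sshll/ushll cases matter: those instructions widen each lane (sign- or zero-extending it) and shift it left in a single operation, so a widening shift by 1 should remain one instruction and not be split into an extend followed by an add. The sketch below models that behaviour for 8-bit source lanes widened to 16 bits. The function names and the fixed lane widths are assumptions for illustration and are not taken from the added tests.

```python
# Illustrative model only: 8-bit source lanes widened to 16-bit results.
SRC_BITS = 8
SRC_MASK = (1 << SRC_BITS) - 1
DST_MASK = (1 << (2 * SRC_BITS)) - 1


def ushll(lanes, shift):
    """Zero-extend each lane to 16 bits, then shift left by `shift`."""
    return [((x & SRC_MASK) << shift) & DST_MASK for x in lanes]


def sshll(lanes, shift):
    """Sign-extend each lane to 16 bits, then shift left by `shift`."""
    def sign_extend(x):
        x &= SRC_MASK
        # A set top bit means the 8-bit lane encodes a negative value.
        return x - (1 << SRC_BITS) if x & (1 << (SRC_BITS - 1)) else x

    return [(sign_extend(x) << shift) & DST_MASK for x in lanes]
```

With a shift of 1, the widened result can differ from the narrow x + x: `ushll([0xFF], 1)` keeps the carried-out bit in the wider lane, whereas an 8-bit add would discard it.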