This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] A tablegen pattern to handle pmull2
Abandoned · Public

Authored by mingmingl on Aug 2 2022, 1:35 PM.

Details

Reviewers
efriedma
dmgreen
Summary

Teach llvm.aarch64.neon.pmull64 to use higher half registers when
exactly one operand is already in the higher half.

  • The other (non-higher-half) operand must be in an FP & SIMD register regardless of this patch, so dup it into the right lane rather than fmov-ing the higher-half operand to the lower half.

This is at least a tie and in most cases a win, for example when pmull and pmull2 instructions execute on the lower half and the higher half of the same source operand (e.g., llvm/test/CodeGen/AArch64/aarch64-pmull2.ll).
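To make the equivalence concrete, here is a minimal semantic sketch in Python. It is not the TableGen pattern from this patch and says nothing about instruction latency; it only models the 64-bit PMULL/PMULL2 forms as carry-less multiplies on lane 0 and lane 1 of a two-lane vector. All helper names (clmul64, pmull, pmull2, dup, move_high_to_low) are hypothetical and exist only for illustration.

```python
MASK64 = (1 << 64) - 1

def clmul64(a, b):
    """Carry-less (polynomial over GF(2)) multiply of two 64-bit values."""
    a &= MASK64
    b &= MASK64
    result = 0
    shift = 0
    while b:
        if b & 1:
            result ^= a << shift  # XOR instead of add: no carries
        b >>= 1
        shift += 1
    return result

# Vectors are modeled as (lane0, lane1) tuples of 64-bit integers.
def pmull(v, w):
    """Model of PMULL (64-bit form): multiplies the lower lanes."""
    return clmul64(v[0], w[0])

def pmull2(v, w):
    """Model of PMULL2 (64-bit form): multiplies the upper lanes."""
    return clmul64(v[1], w[1])

def dup(x):
    """Model of DUP: broadcast a scalar to both lanes."""
    return (x, x)

def move_high_to_low(v):
    """Hypothetical stand-in for moving lane 1 down to lane 0."""
    return (v[1], 0)

a = (0x0123456789ABCDEF, 0xFEDCBA9876543210)  # a[1] is the higher half
b = 0x87                                      # operand not in a higher half

# Option 1: move the higher half of `a` down, then use pmull.
via_move = pmull(move_high_to_low(a), (b, 0))
# Option 2 (the approach described in the summary): dup `b`, then use pmull2.
via_dup = pmull2(a, dup(b))
assert via_move == via_dup == clmul64(a[1], b)

# When both halves of `a` are multiplied by `b`, a single dup serves both.
bb = dup(b)
assert (pmull(a, bb), pmull2(a, bb)) == (clmul64(a[0], b), clmul64(a[1], b))
```

In this model both options produce the same product, which is why the choice between them is purely a question of cost. When pmull and pmull2 both consume the same source vector, the duplicated operand is shared between them.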

Event Timeline

mingmingl created this revision. Aug 2 2022, 1:35 PM
Herald added a project: Restricted Project. Aug 2 2022, 1:35 PM
mingmingl requested review of this revision. Aug 2 2022, 1:35 PM
mingmingl updated this revision to Diff 449432. Aug 2 2022, 2:05 PM

The missing part is operand canonicalization (left vs. right).

efriedma added inline comments.
llvm/test/CodeGen/AArch64/pmull-ldr-merge.ll
31–33

Are you intentionally avoiding the variant of "dup" that takes a GPR operand (DUPv2i64gpr)?

mingmingl marked an inline comment as done. Aug 2 2022, 2:20 PM
mingmingl added inline comments.
llvm/test/CodeGen/AArch64/pmull-ldr-merge.ll
31–33

Nope, no intention to hack this.

It's good that the FPR/SIMD version of DUP is generated, since it takes 2 cycles, while the GPR version takes 3 cycles (so it is more expensive than the MOV from FPR to GPR).

I think this is handled by some combine or peephole, but I need to verify it.

Matt added a subscriber: Matt. Aug 2 2022, 8:22 PM
mingmingl updated this revision to Diff 449532. Aug 2 2022, 9:52 PM
mingmingl marked an inline comment as done.

Add test cases for 1) LHS/RHS canonicalization and 2) the case where the non-higher-half operand comes from a GPR.

llvm/test/CodeGen/AArch64/pmull-ldr-merge.ll
31–33

Added test5 in llvm/test/CodeGen/AArch64/pmull-ldr-merge.ll to show that this is a win when one source operand is from GPR (%2 in define void @test5(ptr %0, <2 x i64> %1, i64 %2) {).

mingmingl updated this revision to Diff 449533. Aug 2 2022, 10:03 PM

Resolve the "patch application failed" error.

mingmingl edited the summary of this revision. Aug 2 2022, 10:05 PM
mingmingl edited the summary of this revision. Aug 2 2022, 10:08 PM
mingmingl updated this revision to Diff 449540. Aug 2 2022, 10:29 PM

Create the patch from the same branch as the diff base to resolve the "patch application failed" error.

mingmingl updated this revision to Diff 449541. Aug 2 2022, 10:31 PM

Mechanical change to fix the test name.

mingmingl abandoned this revision. Aug 2 2022, 10:35 PM

The local patches somehow got messed up. Apologies for this.

I'm going to abandon this revision, re-create the patches, and send out a stacked diff.

Apologies for the spam :-(