Page MenuHomePhabricator

[AArch64][SVE] Improve code generation for vector_splice for Imm == -1
ClosedPublic

Authored by CarolineConcatto on Jul 8 2021, 7:27 AM.

Details

Summary

This patch implements vector_splice in tablegen for:

a) when the immediate is equal to -1 (Imm==1) and uses:
     INSR  +  LASTB

For instance :
@llvm.experimental.vector.splice(Vector_1, Vector_2, -1)
@llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1) ==> <D, E, F, G>

LAST   RegLast, Vector_1                // RegLast = D
INSR   Res, (Vec2Rev >> 1), RegLast     // Res = D + E, F, G

Diff Detail

Event Timeline

CarolineConcatto requested review of this revision.Jul 8 2021, 7:27 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 8 2021, 7:27 AM
CarolineConcatto retitled this revision from [SVE][AArch64] Improve code generation for vector_splice to [WIP] Improve code generation for vector_splice.Jul 8 2021, 7:55 AM
CarolineConcatto removed a reviewer: efriedma.

Hi @CarolineConcatto,

  • For a splice with a positive index, we can use the EXT instruction (if the index fits the immediate range). Because EXT operates on i8 element types, you'll need to take into account the scaling. You can use a ComplexPattern to test if the immediate is in range and then have it scaled, see for example sve_cnt_mul_imm which uses SelectCntImm to check whether the immediate is in range, and then scales the value.
  • For a splice with a negative index, rather than using a REV you could generate a predicate that is: 0, 1, 1, ... 1 using (not(whilelt XZR, -index)).

The most important cases to optimize are: -1 and +1, as those will occur most in practice. In our downstream compiler, we use WHILE+LASTB+INSR instead of SPLICE for the -1 case (which I now realise may save the one not instruction I mention above).

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
7428

If all we do is return Op, then you can just as well say it's Legal. However, in my other comment I'm suggesting to do custom-lowering to bitcast the types, so that we don't need as many patterns.

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
1242

The type isn't actually relevant for splice, although the container width is. If we can lower the splice operation to always work on integer types (converting the input fp vector, and a casting the result back to the original fp type afterwards), then we don't need patterns for all these cases.

Matt added a subscriber: Matt.Jul 8 2021, 11:44 AM
  • Address Sander's suggestion and use EXT when lane/Imm is bigger than 0
  • and use INSR + LAST for lane/Imm equals to -1
  • Implement SelectImm to use with EXT in attemp to scale index
  • Split the patch to handles only Immediate == -1
  • Move cast Float from/to Integer into DAGCombiner.cpp
CarolineConcatto edited the summary of this revision. (Show Details)Jul 19 2021, 6:16 AM
CarolineConcatto edited the summary of this revision. (Show Details)
  • Move bitcast for AArch64ISelLowering

Can [WIP] be removed from the title?

nit on the commit message:

INSR   Res, (Vec2Rev >> 1), RegLast     // Res = D + G, F, E

should be:

INSR   Res, (Vec2Rev >> 1), RegLast     // Res = D + E, F, G
CarolineConcatto retitled this revision from [WIP] Improve code generation for vector_splice to [SVE][AArch64] Improve code generation for vector_splice.Jul 22 2021, 1:28 AM
CarolineConcatto edited the summary of this revision. (Show Details)
CarolineConcatto added a reviewer: eli.friedman.
CarolineConcatto retitled this revision from [SVE][AArch64] Improve code generation for vector_splice to [SVE][AArch64] Improve code generation for vector_splice for Imm == -1.Jul 22 2021, 1:28 AM
CarolineConcatto retitled this revision from [SVE][AArch64] Improve code generation for vector_splice for Imm == -1 to [AArch64][SVE] Improve code generation for vector_splice for Imm == -1.
CarolineConcatto edited the summary of this revision. (Show Details)
  • update commit message with --verbatim
sdesmalen accepted this revision.Jul 22 2021, 5:01 AM

LGTM!

nit: The commit message still refers to Vec2Rev (I didn't spot that in my previous suggestion)

This revision is now accepted and ready to land.Jul 22 2021, 5:01 AM
This revision was landed with ongoing or failed builds.Jul 26 2021, 3:25 AM
This revision was automatically updated to reflect the committed changes.