This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SVE] Add backend support for splats of immediates
ClosedPublic

Authored by cameron.mcinally on Feb 19 2020, 11:33 AM.

Details

Summary

This patch adds backend support for splats of both Int and FP immediates.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptFeb 19 2020, 11:33 AM
efriedma added inline comments.Feb 19 2020, 2:11 PM
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
13

Is there some reason you can't use the existing cpy_imm8_opt_lsl_i8?

342

This is a little inconsistent with the other patterns: I think we also need need nxv2f32/nxv4f16/nxv2f16?

llvm/test/CodeGen/AArch64/sve-vector-splat.ll
306

It looks like splat_nxv4f32_imm is returning the integer 1, not the floating-point 1.0?

Update 2 out of 3 @efriedma comments.

cameron.mcinally marked 4 inline comments as done.Feb 19 2020, 3:08 PM
cameron.mcinally added inline comments.
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
13

I'm not sure I understand this one. What should I replace with cpy_imm8_opt_lsl_i8?

SVE8BitLslImm is looking for two i8 immediates (i8 value and i8 shift amount).

cpy_imm8_opt_lsl_i8 is just checking for one i8 immediate, IINM.

342

Good catch. The f16 tests don't really work, so didn't catch the missing patterns. The nxv2f32 pattern was there.

Added the missing patterns.

llvm/test/CodeGen/AArch64/sve-vector-splat.ll
306

Just a bad copy-and-paste. Updated.

Maybe worth changing the test to use "-mattr=+sve,+fullfp16", so the f16 tests work?

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
13

"class imm8_opt_lsl" has some code which looks like it supposed to be used for matching. Granted, it's using ImmLeaf, so maybe it's just broken.

cameron.mcinally marked 2 inline comments as done.Feb 19 2020, 5:41 PM

Maybe worth changing the test to use "-mattr=+sve,+fullfp16", so the f16 tests work?

Good idea. I'll do that.

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
13

Oh, yeah, I see what you mean now. I'll dig into that...

cameron.mcinally marked an inline comment as done.Feb 20 2020, 9:57 AM
cameron.mcinally added inline comments.
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
13

Looking closer, this seems to be in line with the other ComplexPattern uses (e.g. SVEAddSubImm8Pat).

The SVE8BitLslImm ComplexPattern is used to match the two i8 operands, $imm and $shift. If I tried to use cpy_imm8_opt_lsl_i64 in its place, I don't see a way to get the individual operands by name. E.g. something like:

def : Pat<(nxv2i64 (AArch64dup (i64 (cpy_imm8_opt_lsl_i64 i64:$imm)))),
          (DUP_ZI_D $???, $???)>;
efriedma added inline comments.Feb 20 2020, 10:51 AM
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
13

If you write something like SVE8BitLslImm:$a, it actually counts as two operands, I think, because that's what the ComplexPattern specifies? Not completely sure.

Orthogonal to that, if we aren't going to use the predicates in cpy_imm8_opt_lsl_i64 etc., we should get rid of them.

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
13

If you write something like SVE8BitLslImm:$a, it actually counts as two operands, I think, because that's what the ComplexPattern specifies? Not completely sure.

SVE8BitLslImm usage would be something like this:

(SVE8BitLslImm i32:$a, i32:$b)

where a is an immediate value and b is a shift amount. I think that's necessary, having two separate named immediates, since DUP_ZI_x expects two immediate operands. E.g.

def : Pat<(nxv2i64 (AArch64dup (i64 (SVE8BitLslImm i32:$a, i32:$b)))),
          (DUP_ZI_D $a, $b)>;

Orthogonal to that, if we aren't going to use the predicates in cpy_imm8_opt_lsl_i64 etc., we should get rid of them.

We do use cpy_imm8_opt_lsl_i64 in DUP_ZI_x right now, but it's slightly different:

def : InstAlias<"mov $Zd, $imm",
                (!cast<Instruction>(NAME # _D) ZPR64:$Zd, cpy_imm8_opt_lsl_i64:$imm), 1>;

cpy_imm8_opt_lsl_i64 will match to (ops i32imm, i32imm), which are the two operands of the DUP_ZI_x. This seems desirable since the hardware wants a single immediate constructed from the 2 immediate parts. That is, we want a way to verify that the two immediates actually make sense together.

All this is really at the edge of my TblGen understanding, so take with some skepticism...

efriedma added inline comments.Feb 20 2020, 1:26 PM
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
13

Looking a little more, I think I'm just wrong on the ComplexPattern thing.

We do use cpy_imm8_opt_lsl_i64 in DUP_ZI_x right now, but it's slightly different:

That's not using the IntImm predicate; there's a separate asm operand parser function.

Maybe worth changing the test to use "-mattr=+sve,+fullfp16", so the f16 tests work?

Created D74965 for this, so that we don't have to touch the existing tests here.

Rebase to get +fullfp16 change, and update fp16 tests.

This revision is now accepted and ready to land.Feb 21 2020, 10:27 AM
This revision was automatically updated to reflect the committed changes.