This patch adds backend support for splats of both Int and FP immediates.
Details
Diff Detail
Event Timeline
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td | ||
---|---|---|
13 | Is there some reason you can't use the existing cpy_imm8_opt_lsl_i8? | |
353 | This is a little inconsistent with the other patterns: I think we also need need nxv2f32/nxv4f16/nxv2f16? | |
llvm/test/CodeGen/AArch64/sve-vector-splat.ll | ||
303 | It looks like splat_nxv4f32_imm is returning the integer 1, not the floating-point 1.0? |
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td | ||
---|---|---|
13 | I'm not sure I understand this one. What should I replace with cpy_imm8_opt_lsl_i8? SVE8BitLslImm is looking for two i8 immediates (i8 value and i8 shift amount). cpy_imm8_opt_lsl_i8 is just checking for one i8 immediate, IINM. | |
353 | Good catch. The f16 tests don't really work, so didn't catch the missing patterns. The nxv2f32 pattern was there. Added the missing patterns. | |
llvm/test/CodeGen/AArch64/sve-vector-splat.ll | ||
303 | Just a bad copy-and-paste. Updated. |
Maybe worth changing the test to use "-mattr=+sve,+fullfp16", so the f16 tests work?
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td | ||
---|---|---|
13 | "class imm8_opt_lsl" has some code which looks like it supposed to be used for matching. Granted, it's using ImmLeaf, so maybe it's just broken. |
Good idea. I'll do that.
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td | ||
---|---|---|
13 | Oh, yeah, I see what you mean now. I'll dig into that... |
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td | ||
---|---|---|
13 | Looking closer, this seems to be in line with the other ComplexPattern uses (e.g. SVEAddSubImm8Pat). The SVE8BitLslImm ComplexPattern is used to match the two i8 operands, $imm and $shift. If I tried to use cpy_imm8_opt_lsl_i64 in its place, I don't see a way to get the individual operands by name. E.g. something like: def : Pat<(nxv2i64 (AArch64dup (i64 (cpy_imm8_opt_lsl_i64 i64:$imm)))), (DUP_ZI_D $???, $???)>; |
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td | ||
---|---|---|
13 | If you write something like SVE8BitLslImm:$a, it actually counts as two operands, I think, because that's what the ComplexPattern specifies? Not completely sure. Orthogonal to that, if we aren't going to use the predicates in cpy_imm8_opt_lsl_i64 etc., we should get rid of them. |
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td | ||
---|---|---|
13 |
SVE8BitLslImm usage would be something like this: (SVE8BitLslImm i32:$a, i32:$b) where a is an immediate value and b is a shift amount. I think that's necessary, having two separate named immediates, since DUP_ZI_x expects two immediate operands. E.g. def : Pat<(nxv2i64 (AArch64dup (i64 (SVE8BitLslImm i32:$a, i32:$b)))), (DUP_ZI_D $a, $b)>;
We do use cpy_imm8_opt_lsl_i64 in DUP_ZI_x right now, but it's slightly different: def : InstAlias<"mov $Zd, $imm", (!cast<Instruction>(NAME # _D) ZPR64:$Zd, cpy_imm8_opt_lsl_i64:$imm), 1>; cpy_imm8_opt_lsl_i64 will match to (ops i32imm, i32imm), which are the two operands of the DUP_ZI_x. This seems desirable since the hardware wants a single immediate constructed from the 2 immediate parts. That is, we want a way to verify that the two immediates actually make sense together. All this is really at the edge of my TblGen understanding, so take with some skepticism... |
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td | ||
---|---|---|
13 | Looking a little more, I think I'm just wrong on the ComplexPattern thing.
That's not using the IntImm predicate; there's a separate asm operand parser function. |
Is there some reason you can't use the existing cpy_imm8_opt_lsl_i8?