I started looking into SVE support, and it seems that a significant amount of work remains. We plan to spend some time helping with it, but we are not sure where to start. So I am starting with a very small patch to help me understand the current status of SVE support, and to see whether there are any suggestions on how we can contribute to SVE development more effectively.
Hi!
Thanks for taking an interest in SVE codegen. The tests look fine, but we had a different approach in mind for generating many of the patterns. In this case instead of making separate def : Pat< lines for each type, we add an SDPatternOperator and pattern to the multiclass for that instruction type.
So FADD_ZZZ is an instance of sve_fp_3op_u_zd, and by changing that definition in SVEInstrFormats.td to look like the following code we create matchers for all the instructions using that template.
class sve_fp_3op_u_zd<bits<2> sz, bits<3> opc, string asm,
                      ZPRRegOp zprty, ValueType vt, ValueType vt2,
                      SDPatternOperator op>
: I<(outs zprty:$Zd), (ins zprty:$Zn, zprty:$Zm),
  asm, "\t$Zd, $Zn, $Zm",
  "",
  [(set (vt zprty:$Zd), (op (vt zprty:$Zn), (vt2 zprty:$Zm)))]>,
  Sched<[]> {
  bits<5> Zd;
  bits<5> Zm;
  bits<5> Zn;
  let Inst{31-24} = 0b01100101;
  let Inst{23-22} = sz;
  let Inst{21}    = 0b0;
  let Inst{20-16} = Zm;
  let Inst{15-13} = 0b000;
  let Inst{12-10} = opc;
  let Inst{9-5}   = Zn;
  let Inst{4-0}   = Zd;
}

multiclass sve_fp_3op_u_zd<bits<3> opc, string asm, SDPatternOperator op> {
  def _H : sve_fp_3op_u_zd<0b01, opc, asm, ZPR16, nxv8f16, nxv8f16, op>;
  def _S : sve_fp_3op_u_zd<0b10, opc, asm, ZPR32, nxv4f32, nxv4f32, op>;
  def _D : sve_fp_3op_u_zd<0b11, opc, asm, ZPR64, nxv2f64, nxv2f64, op>;
}
The instruction instantiations in AArch64SVEInstrInfo.td would then need to pass in the correct SDPatternOperator. For now, pass null_frag for the other instructions so that you don't add untested matchers:
defm FADD_ZZZ : sve_fp_3op_u_zd<0b000, "fadd", fadd>;
defm FSUB_ZZZ : sve_fp_3op_u_zd<0b001, "fsub", null_frag>;
defm FMUL_ZZZ : sve_fp_3op_u_zd<0b010, "fmul", null_frag>;
....
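For contrast, here is a rough sketch of the per-type approach that the multiclass change replaces. It is only an illustration: it assumes the existing FADD_ZZZ_H/_S/_D instruction names and uses the ZPR16/ZPR32/ZPR64 operands from the template above, and the exact form of the patterns in the original patch may have differed.

```tablegen
// Illustrative only: one standalone pattern per element type.
// Each line has to repeat the value type and name the instruction explicitly,
// which is what passing an SDPatternOperator into the multiclass avoids.
def : Pat<(nxv8f16 (fadd (nxv8f16 ZPR16:$Zn), (nxv8f16 ZPR16:$Zm))),
          (FADD_ZZZ_H ZPR16:$Zn, ZPR16:$Zm)>;
def : Pat<(nxv4f32 (fadd (nxv4f32 ZPR32:$Zn), (nxv4f32 ZPR32:$Zm))),
          (FADD_ZZZ_S ZPR32:$Zn, ZPR32:$Zm)>;
def : Pat<(nxv2f64 (fadd (nxv2f64 ZPR64:$Zn), (nxv2f64 ZPR64:$Zm))),
          (FADD_ZZZ_D ZPR64:$Zn, ZPR64:$Zm)>;
```

With the pattern embedded in the instruction class, enabling matching for fsub or fmul later becomes a one-word change from null_frag to the corresponding operator, rather than three more def : Pat< lines per instruction.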
Please let me know if my explanation isn't clear enough :)
Thanks @huntergr. Will update the patch. Once I am sure we got this one right, we can look into others.
llvm/lib/Target/AArch64/SVEInstrFormats.td, line 1218:
Just realized that I forgot to check whether we really need two distinct ValueTypes here. I will check that and remove the second one if it is not needed.
LGTM.
You can add the extra multiclass for ftsmul if you wish, but it's not needed until someone implements a matching pattern for that instruction. We only match it against an ACLE intrinsic downstream, not common SDag nodes.
llvm/lib/Target/AArch64/SVEInstrFormats.td, line 1218:
I guess I left a bit out; the reason for the second ValueType was the ftsmul instruction, which takes a vector of integers as one of its arguments. We added a second multiclass just for that instruction:
multiclass sve_fp_3op_u_zd_ftsmul<bits<3> opc, string asm, SDPatternOperator op> {
  def _H : sve_fp_3op_u_zd<0b01, opc, asm, ZPR16, nxv8f16, nxv8i16, op>;
  def _S : sve_fp_3op_u_zd<0b10, opc, asm, ZPR32, nxv4f32, nxv4i32, op>;
  def _D : sve_fp_3op_u_zd<0b11, opc, asm, ZPR64, nxv2f64, nxv2i64, op>;
}