Automatically generated by D130897. This fixed issue #58792.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/X86/X86SchedAlderlakeP.td | ||
---|---|---|
798–801 | CVTSI642SSrr has different schedreadwrite with CVTSI642SSrr_Int? |
llvm/lib/Target/X86/X86SchedAlderlakeP.td | ||
---|---|---|
798–801 | It has been a while since I looked at that .td file. However, I think that the main difference is that some variants accept two operands, while other variants use three-operands forms. More specifically, _Int variants tend to use three operands. CVTSI642SS is a codegen only instance of multiclass sse12_cvt_s, which assumes two input operands: def rr : SI<opc, MRMSrcReg, (outs DstRC:$dst), (ins SrcRC:$src), !strconcat(asm,"\t{$src, $dst|$dst, $src}"), [(set DstRC:$dst, (OpNode SrcRC:$src))]>, Sched<[sched, Int2Fpu]>; That's why we don't need to skip the first use by introducing an "extra" ReadDefault. If you look at the tablegen'd definitions, you see this: { 909, 2, 1, 0, 613, 0|(1ULL<<MCID::MayRaiseFPException), 0x1508023029ULL, ImplicitList18, nullptr, OperandInfo205 }, // Inst #909 = CVTSI642SSrr { 910, 3, 1, 0, 614, 0|(1ULL<<MCID::MayRaiseFPException), 0x1508023029ULL, ImplicitList18, nullptr, OperandInfo204 }, // Inst #910 = CVTSI642SSrr_Int Note how CVTSI642SSrr takes 2 operands (second field), but the _Int variant takes 3 operands. |
llvm/lib/Target/X86/X86SchedAlderlakeP.td | ||
---|---|---|
798–801 | multiclass sse12_cvt_sint_3addr<bits<8> opc, RegisterClass SrcRC, RegisterClass DstRC, X86MemOperand x86memop, string asm, string mem, X86FoldableSchedWrite sched, Domain d, bit Is2Addr = 1> { let hasSideEffects = 0, ExeDomain = d in { def rr_Int : SI<opc, MRMSrcReg, (outs DstRC:$dst), (ins DstRC:$src1, SrcRC:$src2), !if(Is2Addr, !strconcat(asm, "\t{$src2, $dst|$dst, $src2}"), !strconcat(asm, "\t{$src2, $src1, $dst|$dst, $src1, $src2}")), []>, Sched<[sched, ReadDefault, ReadInt2Fpu]>; let mayLoad = 1 in def rm_Int : SI<opc, MRMSrcMem, (outs DstRC:$dst), (ins DstRC:$src1, x86memop:$src2), !if(Is2Addr, asm#"{"#mem#"}\t{$src2, $dst|$dst, $src2}", asm#"{"#mem#"}\t{$src2, $src1, $dst|$dst, $src1, $src2}"), []>, Sched<[sched.Folded, sched.ReadAfterFold]>; } } Should we use Sched<[sched, ReadInt2Fpu] if Is2Addr == 1? |
llvm/lib/Target/X86/X86SchedAlderlakeP.td | ||
---|---|---|
798–801 | I'd think there's some problem when sharing sse12_cvt_sint_3addr between SSE and AVX, though we used Is2Addr to select the assemble. We mistakenly assigned 3 operands to the SSE instructions. This is unexpected since SSE instructions cannot use 3 operands. |
llvm/lib/Target/X86/X86SchedAlderlakeP.td | ||
---|---|---|
798–801 | Changing it would be incorrect. The fact that some of variants are generated using two-address forms, doesn't change the fact that the machine instruction has three operands. let Constraints = "$src1 = $dst" LLVM would still model those instructions as having three operands, with the difference that the first input register must match the destination register. The assembly string would not print the first input operand. Example: { 902, 3, 1, 0, 1150, 0|(1ULL<<MCID::MayRaiseFPException), 0x1508003029ULL, ImplicitList18, nullptr, OperandInfo201 }, // Inst #902 = CVTSI2SSrr_Int It has three operands, even though flag Is2Addr is set. I hope it helps |
I had an offline communication with Haohai. The change to the CodeGenOnly MIs LGTM. Please wait one or two days for other reviewers.
why ((_NT)?) and not (_NT)?