The patch updates scheduling numbers for YMM AVX instructions such as VMOVx, VORx, VXOR, VPERMILx, VBROADCASTx, etc.
lib/Target/X86/X86ScheduleBtVer2.td | ||
---|---|---|
552 | Don't include commented out code |
lib/Target/X86/X86ScheduleBtVer2.td | ||
---|---|---|
550 | Default latency = 1 - so remove and just leave ResourceCycles? | |
589 | This is masked load, and we need 128 bit versions: def WriteVMaskMovLd: SchedWriteRes<[JLAGU,JFPU01]> { let Latency = 6; } def : InstRW<[WriteVMaskMovLd], (instregex "VMASKMOVP(D|S)rm")>; def WriteVMaskMovYLd: SchedWriteRes<[JLAGU,JFPU01]> { let Latency = 6; let ResourceCycles = [1,2]; } def : InstRW<[WriteVMaskMovYLd], (instregex "VMASKMOVP(D|S)Yrm")>; | |
595 | This is masked store, and we need 128 bit versions: def WriteVMaskMovSt: SchedWriteRes<[JFPU01,JSAGU]> { let Latency = 6; } def : InstRW<[WriteVMaskMovSt], (instregex "VMASKMOVP(D|S)mr")>; def WriteVMaskMovYSt: SchedWriteRes<[JFPU01,JSAGU]> { let Latency = 6; let ResourceCycles = [2,1]; } def : InstRW<[WriteVMaskMovYSt], (instregex "VMASKMOVP(D|S)Ymr")>; | |
604 | 128-bit versions need fixing as well - they are Latency=3 too | |
612 | Shouldn't this be [1,1]? (i.e. default - so remove?) | |
614 | Add VPTESTYrr as well | |
618 | Where did you get [1, 4, 2] from? Shouldn't this be [1,2,2]? | |
620 | VPTESTYrm as well |
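For readability, the masked load and store definitions suggested in the comments on lines 589 and 595 are laid out below. The code is taken from those comments with only whitespace and comments added. The comments describe how `ResourceCycles` entries pair up positionally with the resources listed in `SchedWriteRes`.

```
// Masked loads (comment on line 589).
// 128-bit: default resource occupancy, Latency = 6.
def WriteVMaskMovLd: SchedWriteRes<[JLAGU,JFPU01]> {
  let Latency = 6;
}
def : InstRW<[WriteVMaskMovLd], (instregex "VMASKMOVP(D|S)rm")>;

// 256-bit: JLAGU for 1 cycle, JFPU01 for 2 cycles.
def WriteVMaskMovYLd: SchedWriteRes<[JLAGU,JFPU01]> {
  let Latency = 6;
  let ResourceCycles = [1,2];
}
def : InstRW<[WriteVMaskMovYLd], (instregex "VMASKMOVP(D|S)Yrm")>;

// Masked stores (comment on line 595).
// 128-bit: default resource occupancy, Latency = 6.
def WriteVMaskMovSt: SchedWriteRes<[JFPU01,JSAGU]> {
  let Latency = 6;
}
def : InstRW<[WriteVMaskMovSt], (instregex "VMASKMOVP(D|S)mr")>;

// 256-bit: JFPU01 for 2 cycles, JSAGU for 1 cycle.
def WriteVMaskMovYSt: SchedWriteRes<[JFPU01,JSAGU]> {
  let Latency = 6;
  let ResourceCycles = [2,1];
}
def : InstRW<[WriteVMaskMovYSt], (instregex "VMASKMOVP(D|S)Ymr")>;
```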
Some numbers were changed, some new instructions were added, and some model changes were made according to Simon's requirements.
lib/Target/X86/X86ScheduleBtVer2.td | ||
---|---|---|
552 | Drop this - AVX1 doesn't have rr broadcast instructions (just rm instructions) |
lib/Target/X86/X86ScheduleBtVer2.td | ||
---|---|---|
552 | But X86 Instr Info has such instructions; does that mean we should open a corresponding bug? |
Default latency = 1 - so remove and just leave ResourceCycles?
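As a minimal sketch of what this suggestion amounts to: `SchedWriteRes` leaves `Latency` at 1 unless it is overridden, so a write that only needs non-default resource occupancy can drop the latency line. The name `WriteExampleY`, the `[2]` value, and the `"EXAMPLEYrr"` pattern are hypothetical placeholders, not definitions from the patch.

```
// Hypothetical write class: Latency is left at its default of 1,
// so only the resource occupancy is spelled out.
def WriteExampleY : SchedWriteRes<[JFPU01]> {
  let ResourceCycles = [2]; // JFPU01 is occupied for two cycles
}
// Hypothetical instruction pattern, for illustration only.
def : InstRW<[WriteExampleY], (instregex "EXAMPLEYrr")>;
```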