The patch updates the scheduling numbers for YMM AVX instructions such as VMOVx, VORx, VXOR, VPERMILx, and VBROADCASTx.
| lib/Target/X86/X86ScheduleBtVer2.td | ||
|---|---|---|
| 456 | Don't include commented-out code | |
| lib/Target/X86/X86ScheduleBtVer2.td | ||
|---|---|---|
| 454 | Default latency = 1 - so remove and just leave ResourceCycles? | |
| 493 | This is a masked load, and we need 128-bit versions as well: def WriteVMaskMovLd: SchedWriteRes<[JLAGU,JFPU01]> {
let Latency = 6;
}
def : InstRW<[WriteVMaskMovLd], (instregex "VMASKMOVP(D|S)rm")>;
def WriteVMaskMovYLd: SchedWriteRes<[JLAGU,JFPU01]> {
let Latency = 6;
let ResourceCycles = [1,2];
}
def : InstRW<[WriteVMaskMovYLd], (instregex "VMASKMOVP(D|S)Yrm")>; | |
| 499 | This is a masked store, and we need 128-bit versions as well: def WriteVMaskMovSt: SchedWriteRes<[JFPU01,JSAGU]> {
let Latency = 6;
}
def : InstRW<[WriteVMaskMovSt], (instregex "VMASKMOVP(D|S)mr")>;
def WriteVMaskMovYSt: SchedWriteRes<[JFPU01,JSAGU]> {
let Latency = 6;
let ResourceCycles = [2,1];
}
def : InstRW<[WriteVMaskMovYSt], (instregex "VMASKMOVP(D|S)Ymr")>; | |
| 508 | 128-bit versions need fixing as well - they are Latency=3 too | |
| 516 | Shouldn't this be [1,1]? (i.e. default - so remove?) | |
| 518 | Add VPTESTYrr as well | |
| 522 | Where did you get [1, 4, 2] from? Shouldn't this be [1,2,2]? | |
| 524 | VPTESTYrm as well | |
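
Several of the inline comments above turn on the defaults of `SchedWriteRes`: `Latency` defaults to 1, and an unset `ResourceCycles` means one cycle on each listed resource. A minimal sketch of what "remove the defaults" looks like, assuming the fragment sits inside X86ScheduleBtVer2.td where `JLAGU` and `JFPU01` are already defined; the `WriteExample*` names are hypothetical and do not come from the patch:

```tablegen
// Hypothetical names, for illustration only. Assumes JLAGU and JFPU01 are the
// processor resources already defined earlier in X86ScheduleBtVer2.td.

// Redundant form: both fields restate the SchedWriteRes defaults.
def WriteExampleVerbose : SchedWriteRes<[JLAGU, JFPU01]> {
  let Latency = 1;             // default latency is already 1
  let ResourceCycles = [1, 1]; // default is one cycle per listed resource
}

// Equivalent terse form: nothing to override, so no body is needed.
def WriteExampleTerse : SchedWriteRes<[JLAGU, JFPU01]>;

// Only the non-default field is spelled out: the 256-bit operation is assumed
// here to occupy JFPU01 for two cycles while keeping the default latency.
def WriteExampleY : SchedWriteRes<[JFPU01]> {
  let ResourceCycles = [2];
}
```

Keeping only the overridden fields makes the non-default numbers stand out when the model is reviewed against the hardware documentation.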
Some numbers were changed, some new instructions were added, and some model changes were made according to Simon's requirements.
| lib/Target/X86/X86ScheduleBtVer2.td | ||
|---|---|---|
| 456 | Drop this - AVX1 doesn't have rr broadcast instructions (just rm instructions) | |
| lib/Target/X86/X86ScheduleBtVer2.td | ||
|---|---|---|
| 456 | But X86 Instr Info has such instructions: does that mean we should open a corresponding bug, right? | |