This patch improves sched models for XCHG*rr X86 instrs removing unnecessary redefinitions of instr infos. This patch is based on results of D48222.
Details
Diff Detail
Event Timeline
I think this needs mca test coverage first.
llvm/test/tools/llvm-mca/X86$ grep -ri XCHG | wc -l 0
But we have a lot of XCHG tests in other places. For example:
.../llvm/test/CodeGen/X86$ grep -ri XCHG | wc -l
730
Or:
schedule-x86_32.ll:; ATOM-NEXT: xchgl %eax, %eax # sched: [2:1.00]
schedule-x86_32.ll:; ATOM-NEXT: xchgl %ecx, %eax # sched: [2:1.00]
schedule-x86_32.ll:; ATOM-NEXT: xchgl %eax, (%edx) # sched: [3:1.50]
schedule-x86_32.ll:; SLM-LABEL: test_xchg_32:
schedule-x86_32.ll:; SLM-NEXT: xchgl %eax, %eax # sched: [1:0.50]
schedule-x86_32.ll:; SLM-NEXT: xchgl %ecx, %eax # sched: [1:0.50]
schedule-x86_32.ll:; SLM-NEXT: xchgl %eax, (%edx) # sched: [4:2.00]
schedule-x86_32.ll:; SANDY-LABEL: test_xchg_32:
schedule-x86_32.ll:; SANDY-NEXT: xchgl %eax, %eax # sched: [2:1.00]
schedule-x86_32.ll:; SANDY-NEXT: xchgl %ecx, %eax # sched: [2:1.00]
schedule-x86_32.ll:; SANDY-NEXT: xchgl %eax, (%edx) # sched: [6:1.00]
schedule-x86_32.ll:; HASWELL-LABEL: test_xchg_32:
schedule-x86_32.ll:; HASWELL-NEXT: xchgl %eax, %eax # sched: [2:0.75]
schedule-x86_32.ll:; HASWELL-NEXT: xchgl %ecx, %eax # sched: [2:0.75]
schedule-x86_32.ll:; HASWELL-NEXT: xchgl %eax, (%edx) # sched: [9:1.00]
schedule-x86_32.ll:; BROADWELL-LABEL: test_xchg_32:
schedule-x86_32.ll:; BROADWELL-NEXT: xchgl %eax, %eax # sched: [2:0.75]
schedule-x86_32.ll:; BROADWELL-NEXT: xchgl %ecx, %eax # sched: [2:0.75]
schedule-x86_32.ll:; BROADWELL-NEXT: xchgl %eax, (%edx) # sched: [8:1.00]
schedule-x86_32.ll:; SKYLAKE-LABEL: test_xchg_32:
schedule-x86_32.ll:; SKYLAKE-NEXT: xchgl %eax, %eax # sched: [2:0.75]
schedule-x86_32.ll:; SKYLAKE-NEXT: xchgl %ecx, %eax # sched: [2:0.75]
schedule-x86_32.ll:; SKYLAKE-NEXT: xchgl %eax, (%edx) # sched: [10:1.25]
And this patch does not change any of those tests - it's very importent.
But of course you're right: we need such mca tests. I'll try to prepare such tests for XCHG* and CMPXCHG* instrs (I've just started to work with CMPXCHG*).
In general - looks good, assuming that after rebasing ontop of D49912 there are no regressions.
A few nits.
The SLM & btver2 - are those placeholders, or those are real values from agner?
lib/Target/X86/X86SchedHaswell.td | ||
---|---|---|
128 | Nit: align? | |
lib/Target/X86/X86SchedSkylakeClient.td | ||
114 | Nit: align? | |
lib/Target/X86/X86Schedule.td | ||
122 | // Exchange the contents of two operands | |
lib/Target/X86/X86ScheduleBtVer2.td | ||
172 | These are from agner? |
They did not have any notes about XCHG*rr instrs and that means that default values were OK for them. I kept the default values and tests were OK but it's better to check with agner of course. I'll do it.
lib/Target/X86/X86ScheduleBtVer2.td | ||
---|---|---|
172 | Agner shows different values for r8,r8 and r,r while AMD64_16h_InstrLatency_1.1 shows 'ucode' for all variants. I used the default values because it keeps the current tests untouched. That's our old problem: what should be selected as "right" value? |
lib/Target/X86/X86ScheduleBtVer2.td | ||
---|---|---|
172 | Ideally - what llvm-exegesis says for these. |
If you want to rename them that's OK - but XADD is just a form of XCHG (it just happens to perform the sum as well).
XADD*rr instrs were added: I did not add rm version because we should decide how to do it beter:
like in D50070
or with help of WriteSequence suggested by craig.topper
lib/Target/X86/X86Schedule.td | ||
---|---|---|
122 | Add // Compare+Exchange - TODO RMW support. |
The comment was added:
def WriteXCHG : SchedWrite; // Compare+Exchange - TODO RMW support.
Nit: align?