This patch closes PR32801. It covers both the SSSE3 and AVX versions.
Event Timeline
This needs to be done in general, not just for Jaguar. Could you please add WriteFHAdd and WriteVecHAdd defs in X86Schedule.td, and then tag the relevant instructions in X86InstrSSE.td and X86InstrAVX512.td? Then in ScheduleBtVer2.td you need to add instances of the two defs and special-case the ymm versions. For the other x86 models, either add TODOs or add the numbers yourself if you want to dig through Agner's tables.
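The request above could be sketched roughly as follows. This is only an illustration, not the patch itself: it assumes the X86SchedWritePair multiclass that X86Schedule.td uses for other foldable writes, it assumes the JFPU01 and JLAGU resources of the Jaguar model, and every latency shown is a placeholder rather than a measured value.

```tablegen
// X86Schedule.td (sketch): declare the new write types.
// X86SchedWritePair is assumed to define both the reg-reg write
// (e.g. WriteFHAdd) and its folded-load variant (e.g. WriteFHAddLd).
defm WriteFHAdd   : X86SchedWritePair; // FP horizontal add/sub.
defm WriteVecHAdd : X86SchedWritePair; // Integer horizontal add/sub.

// X86InstrSSE.td (sketch): tag the instruction definitions.
//   reg-reg form: Sched<[WriteFHAdd]>
//   reg-mem form: Sched<[WriteFHAddLd, ReadAfterLd]>

// X86ScheduleBtVer2.td (sketch): give the new writes resources.
// The latencies below are placeholders, not Jaguar measurements.
def : WriteRes<WriteFHAdd, [JFPU01]> {
  let Latency = 3; // placeholder
}
def : WriteRes<WriteFHAddLd, [JLAGU, JFPU01]> {
  let Latency = 8; // placeholder
}
```

Models that have not been measured yet could carry the same two entries with a TODO comment next to the numbers.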
I redesigned the implementation according to Simon's requirements. It is now done in a general way, so every X86 model should support modeling of horizontal operations. I did not check the numbers for SB and SLM; I simply kept the current ones. I also separated the Ymm version from the Xmm version so that the corresponding throughput difference can be modeled for Jaguar.
Adding some Intel guys to look at the SLM/SB/HW model changes.
lib/Target/X86/X86InstrSSE.td | ||
---|---|---|
5194 ↗ | (On Diff #100683) | Move on to previous line and re-add ReadAfterLd |
lib/Target/X86/X86SchedSandyBridge.td | ||
164 ↗ | (On Diff #100683) | Isn't this comment out of date? You're modelling the horizontal operations below. |
lib/Target/X86/X86Schedule.td | ||
82 ↗ | (On Diff #100683) | I'm not sure we should be introducing size specific versions - we don't do this for any other cases. Merge back into WriteHAdd? |
lib/Target/X86/X86ScheduleBtVer2.td | ||
68 | You shouldn't need this - just use JFPU01 | |
lib/Target/X86/X86ScheduleSLM.td | ||
144 ↗ | (On Diff #100683) | Isn't this comment out of date? You're modelling the horizontal operations below. |
lib/Target/X86/X86Schedule.td | ||
---|---|---|
82 ↗ | (On Diff #100683) | I did this deliberately: Jaguar has different numbers for Ymm and Xmm, and we can't model that difference without a dedicated SchedWrite. Is that OK? |
lib/Target/X86/X86ScheduleSLM.td | ||
---|---|---|
144 ↗ | (On Diff #100683) | In fact we did not model SLM here. I simply kept the current numbers: they could be wrong. |
lib/Target/X86/X86SchedSandyBridge.td | ||
---|---|---|
169 ↗ | (On Diff #100683) | I don't think this is needed, as 'ResourceCycles = [1]' is the default anyway. |
lib/Target/X86/X86SchedHaswell.td | ||
---|---|---|
1530 ↗ | (On Diff #101039) | Please see the accurate modeling I added for these instrs in https://reviews.llvm.org/D33897 |
1538 ↗ | (On Diff #101039) | Please see the accurate modeling I added for these instrs in https://reviews.llvm.org/D33897 |
1901 ↗ | (On Diff #101039) | Please see the accurate modeling I added for these instrs in https://reviews.llvm.org/D33897 |
1903 ↗ | (On Diff #101039) | Please see the accurate modeling I added for these instrs in https://reviews.llvm.org/D33897 |
lib/Target/X86/X86ScheduleBtVer2.td | ||
---|---|---|
335 | The 'let ResourceCycles = [1,1]' is the default and so redundant. You could shorten your patch by removing it if you like. |
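
To illustrate the redundancy noted above, here is a minimal sketch. The write and resource names (WriteFHAddLd, JLAGU, JFPU01) and the latency are assumptions chosen for illustration, not lines copied from the patch. When ResourceCycles is left unset, each listed resource is treated as busy for one cycle, so the two forms describe the same behavior.

```tablegen
// Form A: explicit, but redundant.
def : WriteRes<WriteFHAddLd, [JLAGU, JFPU01]> {
  let Latency = 8;             // placeholder value
  let ResourceCycles = [1, 1]; // same as the default
}

// Form B: equivalent and shorter. Use this instead of Form A,
// not in addition to it, since a write gets one WriteRes per model.
// def : WriteRes<WriteFHAddLd, [JLAGU, JFPU01]> {
//   let Latency = 8;           // placeholder value
// }
```

An explicit ResourceCycles list only becomes necessary when some resource is held for more than one cycle.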