This patch adds correct scheduling numbers for the F16C instructions on the btver2 CPU.
- Repository: rL LLVM
| lib/Target/X86/X86ScheduleBtVer2.td | ||
|---|---|---|
| 395 | (On Diff #119471) | I think 'ResourceCycles = [1];' is the default, so you should be able to drop this? |
| 403 | (On Diff #119471) | Split off the cvtph2ps instructions from cvtps2ph: the load/store cases are definitely different, and it makes little sense to keep the rr cases together. |
| 405 | (On Diff #119471) | Shouldn't the JFPU1 case be JFPU01? The AMD docs say 'STC,FPA\|FPM'. |
| 412 | (On Diff #119471) | Shouldn't the JFPU1 case be JFPU01? |
| 426 | (On Diff #119471) | cvtph2ps is a load, so the JLAGU is the first stage, not the last. |

| test/CodeGen/X86/f16c-schedule.ll | ||
|---|---|---|
| 48 | (On Diff #119471) | This should be [8:1.00] |
| 106 | (On Diff #119471) | These should be [8:2.00] and [3:2.00]? |

| lib/Target/X86/X86ScheduleBtVer2.td | ||
|---|---|---|
| 405 | (On Diff #119471) | It's again a discrepancy between the Agner Fog and AMD docs :-( |
Stores are tricky because we can't easily model the time until the value is written out to memory. The best we can do is count only the cycles for the conversion, after which the value disappears into the memory queue. This means spill/reload round trips can't be modelled easily, but then we don't handle STLF (store-to-load forwarding) timings either.
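For readers less familiar with the scheduling-model syntax, the load and store points raised in this thread can be illustrated with a small standalone sketch. This is not the patch's code: the two classes are simplified stand-ins for those in TargetSchedule.td, the `Example*` def names and latency values are hypothetical, `JSAGU` is assumed to be the store address unit, and the sketch assumes an empty `ResourceCycles` list means one cycle per listed resource.

```tablegen
// Simplified stand-ins for the real classes in TargetSchedule.td (assumption).
class ProcResource<int num> {
  int NumUnits = num;
}
class SchedWriteRes<list<ProcResource> resources> {
  list<ProcResource> ProcResources = resources;
  list<int> ResourceCycles = []; // assumed: empty means 1 cycle per resource
  int Latency = 1;
}

// JLAGU and JFPU1 are named in the review; JSAGU and the unit counts are assumptions.
def JLAGU : ProcResource<1>; // load address generation
def JSAGU : ProcResource<1>; // store address generation
def JFPU1 : ProcResource<1>; // FP pipe 1

// Hypothetical load-folded write: the load unit is listed first, and no
// explicit ResourceCycles is given because every resource is used for 1 cycle.
def ExampleLoadWrite : SchedWriteRes<[JLAGU, JFPU1]> {
  let Latency = 8; // illustrative value only
}

// Hypothetical store-folded write: the latency counts only the conversion;
// the time the value spends in the memory queue is not modelled.
def ExampleStoreWrite : SchedWriteRes<[JFPU1, JSAGU]> {
  let Latency = 3; // illustrative value only
}
```

Because the stand-in classes are defined in the fragment itself, it should be possible to inspect it with `llvm-tblgen -print-records` without including any LLVM target files.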
LGTM with one minor comment.
| lib/Target/X86/X86ScheduleBtVer2.td | ||
|---|---|---|
| 438 | (On Diff #120000) | This should be Latency = 8 |