Introduce a basic schedule model for AMD Zen 3 CPUs, a.k.a. znver3.
This is fully built from scratch, from llvm-mca measurements
and documented reference materials.
Nothing was copied from znver2/znver1.
I believe this is in a reasonable state of completion for inclusion,
probably better than D52779 bdver2 was :)
Namely:
- uops are pretty spot-on (at least what llvm-mca can measure)
- latency is also pretty spot-on (at least what llvm-mca can measure)
- throughput is within reason
I haven't run many benchmarks with this, but the ones I did run say this is beneficial.
I'll call out the obvious problems here:
- I didn't really bother with X87 instructions
- I didn't really bother with obviously-microcoded/system instructions
- There is a large discrepancy in throughput for mr and rm instructions. I'm not really sure whether it's a modelling defect that needs to be fixed or a defect in the measurements.
- Pipe distributions are probably bad :) I can't do much here until AMD allows that to be fixed by documenting the appropriate counters and updating libpfm.
That being said, as @RKSimon notes:
so how much worse could this possibly be?!
Things that aren't there:
- Various tunings: zero idioms, etc. Those are for follow-ups.
Do you have the FPU pipe assignments for znver3 (or znver2, for that matter)?