This patch introduces the following changes to the btver2 scheduling model:
- Set the number of micro opcodes for YMM loads and stores to 2 (it was incorrectly set to 1 for both aligned and misaligned loads/stores).
- Increased the number of AGU resource cycles for YMM loads and stores to 2cy (instead of 1cy).
- Removed JFPU01 and JFPX from the list of resources consumed by pure float/vector loads (MMX loads excluded).
I verified with llvm-exegesis that pure XMM/YMM loads are no-pipe: they are dispatched to the FPU but are not actually issued on JFPU01.
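
For illustration only, the load-side changes described above could look roughly like the following in the btver2 model. This is a sketch, not the actual diff: it assumes the `X86WriteRes` helper takes its arguments in the order (SchedWrite, resources, latency, resource cycles, micro opcodes), and the `WriteFLoad`/`WriteFLoadY` names and the latency of 5 are assumptions that should be checked against the tree.

```
// Hypothetical sketch, not the committed change.
// Assumed argument order:
//   <SchedWrite, [resources], latency, [resource cycles], micro opcodes>
// The SchedWrite names and the latency value (5) are illustrative assumptions.

// Pure XMM load: consumes only the load AGU; JFPU01 and JFPX are no longer listed.
defm : X86WriteRes<WriteFLoad,  [JLAGU], 5, [1], 1>;

// Pure YMM load: 2 AGU resource cycles and 2 micro opcodes.
defm : X86WriteRes<WriteFLoadY, [JLAGU], 5, [2], 2>;
```

YMM stores would get the same treatment on the store AGU side (2 resource cycles, 2 micro opcodes); their remaining resources are left out here because the description above does not list them.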