- Missing instruction information (FTSSEL, FMSB, PFIRST and RDFFR) is added and CompleteModel is set to one.
- Information for pseudo SVE instructions is added. Those instructions are present at the time of scheduling.
- Resource and latency information for SVE instructions is modified to be more accurate. For example, the description for CMPEQ, which consumes one cycle each of the FLA and PPR units, changes as follows.
  Previous:
    def A64FXGI01 : ProcResGroup<[A64FXIPFLA, A64FXIPPR]>;
    def A64FXWrite_4Cyc_GI01 : SchedWriteRes<[A64FXGI01]> {...
  Modified:
    def A64FXGI0 : ProcResGroup<[A64FXIPFLA]>;
    def A64FXGI1 : ProcResGroup<[A64FXIPPR]>;
    def A64FXWrite_CMP : SchedWriteRes<[A64FXGI0, A64FXGI1]> {...
Reference: A64FX Microarchitecture Manual (Table 16-3)
https://github.com/fujitsu/A64FX/blob/master/doc/A64FX_Microarchitecture_Manual_en_1.7.pdf
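As a rough illustration of why the CMPEQ split matters (a toy Python model, not LLVM code): a SchedWriteRes that names a single ProcResGroup containing both units can be satisfied by either unit, whereas naming two single-unit groups requires one unit from each, i.e. both FLA and PPR. The function name issue_options and the list-of-lists encoding are made up for this sketch.

```python
from itertools import product


def issue_options(write_resources):
    """Return every set of units one instruction could occupy.

    write_resources is a list of groups; the instruction needs one unit
    from each group (a simplified reading of ProcResGroup semantics).
    """
    return {frozenset(choice) for choice in product(*write_resources)}


# Previous model: one group holding both units, so either unit suffices.
previous = [["FLA", "PPR"]]
# Modified model: two single-unit groups, so both units are consumed.
modified = [["FLA"], ["PPR"]]

print(sorted(sorted(s) for s in issue_options(previous)))  # [['FLA'], ['PPR']]
print(sorted(sorted(s) for s in issue_options(modified)))  # [['FLA', 'PPR']]
```

Under the previous encoding the model allows CMPEQ to issue while occupying only one of the two units, which understates its resource use. The modified encoding leaves a single option that occupies both.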
Although this patch significantly changes the resource/latency information for SVE instructions, I believe the change in scheduling results will be small. This is because minimizing register pressure takes priority in scheduling for this machine (see GenericScheduler::tryCandidate() in MachineScheduler.cpp).
I would like to implement more aggressive scheduling (such as pipelining), and this modification is a prerequisite for that.
It might be worth removing some of these duplicate fmovs. I did the same for the N2 schedule tests in rG408378a0b3b0.