This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Fix and add A64FX scheduling resource/latency info
ClosedPublic

Authored by ytmukai on Aug 4 2022, 7:07 AM.

Details

Summary
  1. Missing instruction information (FTSSEL, FMSB, PFIRST and RDFFR) is added and CompleteModel is set to one.
  2. Information for pseudo SVE instructions is added. Those instructions are present at the time of scheduling.
  3. Resource and latency information for SVE instructions is modified to be more accurate. For example, the description for CMPEQ, which consumes one cycle each of unit FLA and PPR, is as follows.
Previous:
  def A64FXGI01 : ProcResGroup<[A64FXIPFLA, A64FXIPPR]>;
  def A64FXWrite_4Cyc_GI01 : SchedWriteRes<[A64FXGI01]> {...
Modified:
  def A64FXGI0 : ProcResGroup<[A64FXIPFLA]>;
  def A64FXGI1 : ProcResGroup<[A64FXIPPR]>;
  def A64FXWrite_CMP : SchedWriteRes<[A64FXGI0, A64FXGI1]> {...
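
The difference between the two descriptions can be illustrated with a small toy model. This is only a sketch, not LLVM's scheduler or llvm-mca: the function name `min_cycles` and the greedy booking rule are assumptions made for illustration, and only the unit names FLA and PPR come from the summary. A group is modelled as "occupy one cycle on any one unit of the group", so a single two-unit group lets an instruction use either unit, while two single-unit groups force it to use both.

```python
# Toy model only: NOT LLVM's scheduler. min_cycles and the greedy rule are
# hypothetical; FLA and PPR are the unit names from the summary.
def min_cycles(n_instrs, groups):
    """Cycles needed to issue n_instrs identical instructions.

    Each instruction books one cycle on one unit of every group it lists,
    always choosing the least-loaded unit of that group.
    """
    busy = {}  # unit name -> cycles already booked
    for _ in range(n_instrs):
        for group in groups:
            unit = min(group, key=lambda u: busy.get(u, 0))
            busy[unit] = busy.get(unit, 0) + 1
    return max(busy.values(), default=0)


previous = [["FLA", "PPR"]]    # one group: either FLA or PPR
modified = [["FLA"], ["PPR"]]  # two groups: FLA and PPR

print(min_cycles(4, previous))  # 2
print(min_cycles(4, modified))  # 4
```

Under this model the previous description lets four CMPEQ-like instructions finish in two cycles, because they spread across the two units, whereas the modified description needs four cycles, since every instruction occupies both units.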

Reference: A64FX Microarchitecture Manual (Table 16-3)
https://github.com/fujitsu/A64FX/blob/master/doc/A64FX_Microarchitecture_Manual_en_1.7.pdf

Although this patch significantly changes the resource/latency information for SVE instructions, I believe that the change in scheduling results will be small. This is because minimizing register pressure takes priority in scheduling for this machine (see GenericScheduler::tryCandidate() in MachineScheduler.cpp).

I would like to implement more aggressive scheduling (such as pipelining) and this modification is necessary for that.

Event Timeline

ytmukai created this revision. Aug 4 2022, 7:07 AM
ytmukai requested review of this revision. Aug 4 2022, 7:07 AM
Herald added a project: Restricted Project. Aug 4 2022, 7:07 AM
Matt added a subscriber: Matt. Aug 4 2022, 8:35 AM

Thanks @ytmukai.

@c-rhodes This patch includes a follow-up to your comment D128631#3618052. @ytmukai is my colleague. Most other changes are refinements.

The buildbot failure is unrelated to this commit. Other patches have the same failure.

dmgreen accepted this revision. Aug 5 2022, 12:30 AM

I don't know the details of this core, I am trusting you on that, but from an LLVM perspective this looks good. The tests are nice to have too, thanks for adding them. I have a small suggestion on reducing some of the duplicates, but otherwise LGTM.

llvm/test/tools/llvm-mca/AArch64/A64FX/A64FX-sve-instructions.s
632

It might be worth removing some of these duplicate fmovs. I've done the same for the N2 schedule tests in rG408378a0b3b0.

This revision is now accepted and ready to land. Aug 5 2022, 12:30 AM
kawashima-fj accepted this revision. Aug 7 2022, 7:05 PM

LGTM. I talked with @ytmukai and he explained to me how he updated the info. Though I didn't review all the updated lines, his approach is reasonable.

ytmukai updated this revision to Diff 450695. Aug 7 2022, 9:55 PM

Resolve the review comment and slightly modify a comment line.

ytmukai marked an inline comment as done. Aug 7 2022, 11:46 PM

Thanks for the reviews! I fixed the duplication. @kawashima-fj Please land this patch.

llvm/test/tools/llvm-mca/AArch64/A64FX/A64FX-sve-instructions.s
632

I removed the duplicates similarly.

ytmukai marked an inline comment as done. Aug 7 2022, 11:47 PM
This revision was landed with ongoing or failed builds. Aug 8 2022, 7:01 PM
This revision was automatically updated to reflect the committed changes.
llvm/lib/Target/AArch64/AArch64SchedA64FX.td