A64FX is used in FUJITSU Supercomputer PRIMEHPC FX1000, PRIMEHPC FX700,
and supercomputer Fugaku.
https://www.fujitsu.com/global/products/computing/servers/supercomputer/specifications/
Differential D75594
[AArch64] Add support for Fujitsu A64FX
Closed, Public. Authored by kawashima-fj on Mar 3 2020, 11:12 PM.
Event Timeline

Herald added subscribers: llvm-commits, jfb, hiraditya, kristof.beyls. (Mar 3 2020, 11:12 PM)

Sounds good. Should there be some clang tests? For example, in clang/test/Driver/aarch64-cpus.c. The technical paper mentions dotprod. I presume it means SVE dotprod, not AEK_DOTPROD?

Looks good, I agree with @dmgreen that a clang driver test would be nice. I think AEK_DOTPROD was introduced with 8.4 (and backported to 8.2 as an optional feature?) so I suspect the dot product support is just for SVE; it certainly isn't present in the cpuinfo feature flags. As far as the loop alignment goes, would the A64FX benefit from planting an unconditional branch at the start of a series of alignment nops to skip actually executing them? (Not a change I'm requesting in this patch, just wondering if it would help with performance if we did have to plant lots of nops.)

@huntergr @dmgreen Thanks for your reviews. I added tests to clang/test/Driver/aarch64-cpus.c and clang/test/Preprocessor/aarch64-target-features.c. Yes, you are correct. 'Dot product' in Fujitsu technical papers denotes 'integer dot product' in SVE, not 'SIMD dot product' in ARMv8.2-DotProd. My colleague measured SPEC CPU2017 with PrefLoopLogAlignment = 2 and 5. The performance effect varies depending on the benchmark, and we could not determine the best parameter. I want to submit this patch with 5 and revisit it later (possibly with the scheduling model).

Sounds sensible. LGTM.

This revision is now accepted and ready to land. (Mar 5 2020, 11:34 PM)

Closed by commit rGc8cd1a994d28: [AArch64] Add support for Fujitsu A64FX (authored by kawashima-fj). (Mar 9 2020, 3:44 AM)

This revision was automatically updated to reflect the committed changes.

This revision is now accepted and ready to land. (Mar 17 2020, 3:28 PM)

Yes, https://github.com/fujitsu/A64FX contains the official microarchitecture information of A64FX. I wanted to include the URL in the Git commit message, but the disclosure was not ready for it at the time.

Can you do it at the next commit opportunity, as this reference manual should be broadly read by the Arm developer community?

Sure, I'll do that.

Done in b54337070b198cf66356a4ee3e420666151a2023.
Revision Contents
Diff 249051
clang/test/Driver/aarch64-cpus.c
clang/test/Preprocessor/aarch64-target-features.c
llvm/include/llvm/Support/AArch64TargetParser.def
llvm/lib/Support/Host.cpp
llvm/lib/Target/AArch64/AArch64.td
llvm/lib/Target/AArch64/AArch64Subtarget.h
llvm/lib/Target/AArch64/AArch64Subtarget.cpp
llvm/test/CodeGen/AArch64/cpus.ll
llvm/test/CodeGen/AArch64/preferred-function-alignment.ll
llvm/unittests/Support/Host.cpp
llvm/unittests/Support/TargetParserTest.cpp
A 32-byte loop alignment sounds very high. Are you sure executing that many NOPs will be beneficial?