This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Add support for Fujitsu A64FX
Closed Public

Authored by kawashima-fj on Mar 3 2020, 11:12 PM.

Details

Summary

A64FX is used in FUJITSU Supercomputer PRIMEHPC FX1000, PRIMEHPC FX700,
and supercomputer Fugaku.

https://www.fujitsu.com/global/products/computing/servers/supercomputer/specifications/

Event Timeline

kawashima-fj created this revision. Mar 3 2020, 11:12 PM
kawashima-fj updated this revision to Diff 248116. Edited Mar 4 2020, 12:23 AM

An unnecessary comment line is removed and indentation is aligned.

dmgreen added a subscriber: dmgreen.

Sounds good. Should there be some clang tests? For example in clang/test/Driver/aarch64-cpus.c

The technical paper mentions dotprod. I presume it means SVE dotprod, not AEK_DOTPROD?

llvm/lib/Target/AArch64/AArch64Subtarget.cpp:94

A 32-byte loop alignment sounds very high. Are you sure executing that many NOPs will be beneficial?

huntergr added a comment. Edited Mar 4 2020, 6:46 AM

Looks good, I agree with @dmgreen that a clang driver test would be nice.

I think AEK_DOTPROD was introduced with 8.4 (and backported to 8.2 as an optional feature?) so I suspect the dot product support is just for SVE; it certainly isn't present in the cpuinfo feature flags.
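The cpuinfo check mentioned above can be sketched as follows. This is a minimal illustration, assuming the Linux arm64 convention of a `Features` line in `/proc/cpuinfo`, where (to my understanding) `asimddp` is the flag for the Advanced SIMD dot product and `sve` is the flag for SVE. The `has_feature` helper and the sample line are hypothetical; the sample is not real A64FX output.

```python
def has_feature(features_line: str, flag: str) -> bool:
    """Return True if `flag` appears as a whole token in a cpuinfo Features line."""
    # Everything after the first colon is a space-separated list of flags.
    _, _, values = features_line.partition(":")
    return flag in values.split()


# Illustrative line only (an assumption, not captured from real hardware).
SAMPLE_FEATURES = "Features\t: fp asimd crc32 atomics fphp asimdhp sve"

if __name__ == "__main__":
    for flag in ("asimddp", "sve"):
        state = "present" if has_feature(SAMPLE_FEATURES, flag) else "absent"
        print(f"{flag}: {state}")
```

Matching whole tokens rather than substrings matters here, because `asimd` is a prefix of `asimddp` and `asimdhp`.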

As far as the loop alignment goes, would the A64FX benefit from planting an unconditional branch at the start of a series of alignment nops to skip actually executing them? (Not a change I'm requesting in this patch, just wondering if it would help with performance if we did have to plant lots of nops)
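To make the trade-off in this question concrete, here is a rough sketch that counts how many padding NOPs are needed to align a loop header, and how many instructions would execute on fall-through if one unconditional branch were placed ahead of the padding. It assumes fixed 4-byte AArch64 instructions and a 4-byte-aligned starting address. The function names are hypothetical, and this is not how LLVM emits alignment padding.

```python
INSTR_BYTES = 4  # AArch64 instructions have a fixed 4-byte encoding


def padding_nops(address: int, log_align: int) -> int:
    """Number of 4-byte NOPs needed to reach the next 2**log_align boundary."""
    align = 1 << log_align
    pad_bytes = (-address) % align
    return pad_bytes // INSTR_BYTES


def executed_with_branch(address: int, log_align: int) -> int:
    """Padding instructions executed on fall-through when a branch skips the NOPs.

    Hypothetical scheme: if two or more padding slots are needed, the first
    slot holds an unconditional branch to the aligned address, so only that
    one instruction executes. A single padding slot stays a plain NOP.
    """
    return min(padding_nops(address, log_align), 1)


if __name__ == "__main__":
    addr = 0x1004
    print(padding_nops(addr, 5), executed_with_branch(addr, 5))
```

Whether a taken branch is actually cheaper than a run of NOPs depends on the microarchitecture, which this sketch does not model.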

@huntergr @dmgreen Thanks for your reviews.

I added tests to clang/test/Driver/aarch64-cpus.c and clang/test/Preprocessor/aarch64-target-features.c.

Yes, you are correct. 'Dot product' in Fujitsu technical papers denotes 'integer dot product' in SVE, not 'SIMD dot product' in ARMv8.2-DotProd.

My colleague measured SPEC CPU2017 with PrefLoopLogAlignment = 2 and 5. The performance effect varies depending on benchmarks, and we could determine the best parameter. I want to submit this patch with 5 and revisit it later (possibly with the scheduling model).
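For reference, the two values compared above are log2 alignments, so 2 corresponds to 4 bytes and 5 to the 32 bytes raised in the inline comment. A small sketch of the arithmetic, assuming 4-byte instructions and a loop header that is already 4-byte aligned (the helper names are hypothetical):

```python
INSTR_BYTES = 4  # fixed AArch64 instruction size


def log_alignment_to_bytes(log_align: int) -> int:
    """Convert a log2 alignment value to an alignment in bytes."""
    return 1 << log_align


def worst_case_padding_nops(log_align: int) -> int:
    """Largest number of NOPs that could precede an aligned loop header."""
    return log_alignment_to_bytes(log_align) // INSTR_BYTES - 1


if __name__ == "__main__":
    for log_align in (2, 5):
        print(
            f"PrefLoopLogAlignment={log_align}: "
            f"{log_alignment_to_bytes(log_align)}-byte alignment, "
            f"up to {worst_case_padding_nops(log_align)} padding NOPs"
        )
```

With a value of 2 the alignment equals the instruction size, so no padding is ever inserted; with 5 the worst case is seven NOPs per aligned loop.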

dmgreen accepted this revision. Mar 5 2020, 11:34 PM

I want to submit this patch with 5 and revisit it later (possibly with the scheduling model).

Sounds sensible. LGTM.

This revision is now accepted and ready to land. Mar 5 2020, 11:34 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. Mar 9 2020, 3:44 AM
Herald added a subscriber: cfe-commits.
ikitayama reopened this revision. Mar 17 2020, 3:28 PM
ikitayama added a subscriber: ikitayama.

Publicly available information on A64FX:

https://github.com/fujitsu/A64FX

This revision is now accepted and ready to land. Mar 17 2020, 3:28 PM

Yes, https://github.com/fujitsu/A64FX contains the official microarchitecture information of A64FX. I wanted to include the URL in the Git commit message but the disclosure was not ready for it at the time.

Can you do it at the next commit opportunity as this reference manual should be broadly read by the Arm developer community?

Sure, I'll do that.

kawashima-fj closed this revision. Jan 15 2021, 4:32 AM

Done in b54337070b198cf66356a4ee3e420666151a2023.

Matt added a subscriber: Matt. Jan 19 2021, 9:13 AM