The Fugaku supercomputer is built on the Fujitsu A64FX microprocessor, whose cache line size is 256 bytes. libomp currently uses a cache line size of 128 bytes for PPC64 and 64 bytes for everything else. This patch adds support for the 256-byte cache line on A64FX. Note that although A64FX is an AArch64 implementation, other AArch64 CPUs do not share this cache line size, so checking the architecture alone is not enough. Following the UCX source code (https://github.com/openucx/ucx/blob/392443ab92626412605dee1572056f79c897c6c3/src/ucs/arch/aarch64/cpu.c#L17), the only way to decide is to check whether the CPU is a Fujitsu A64FX.
Repository: rG LLVM Github Monorepo
Event Timeline
Why is it necessary to write and compile a C program just to parse /proc/cpuinfo? Can this be done directly from CMake?
I understand that the predefined variables don't work, but we can surely parse /proc/cpuinfo with native CMake commands instead of compiling a C program, no?
openmp/runtime/cmake/LibompGetArchitecture.cmake, lines 78–81: Can you use TRUE and FALSE here? This also avoids the overly generic MATCHES "1" at the call site.
I'm going to revert this as it breaks CMake on systems which do not have /proc/cpuinfo such as macOS.
This may be a bit hard to see because the code isn't reached unless the architecture is aarch64, but on an ARM macOS system that path is taken. It would also be taken on the BSDs or any other OS that runs on AArch64 without /proc/cpuinfo.
For your reference, here is the error message from CMake for me:
CMake Error at /Users/chandlerc/src/llvm/llvm-project/openmp/runtime/cmake/LibompGetArchitecture.cmake:74 (file):
  file failed to open for reading (No such file or directory):
    /proc/cpuinfo
Call Stack (most recent call first):
  /Users/chandlerc/src/llvm/llvm-project/openmp/runtime/CMakeLists.txt:73 (libomp_is_aarch64_a64fx)
I didn't think my CMake fu was up to it, but I think I have a fix, WDYT: https://reviews.llvm.org/D94889
And thanks to the speedy review, landed fix in: https://github.com/llvm/llvm-project/commit/f855751c1284c82c1c46b98f6d1b3ca2021d6cb9
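For readers following the thread, here is a minimal sketch of how the points raised above could fit together in native CMake: read /proc/cpuinfo only if it exists, and return TRUE or FALSE rather than "1". It is not the code from D94889 or the landed commit. The function name `sketch_is_aarch64_a64fx` and its variable names are hypothetical, and the match on implementer `0x46` with part `0x001` is an assumption about how an A64FX core appears in /proc/cpuinfo.

```cmake
# Hypothetical helper: sets `return_var` to TRUE only when /proc/cpuinfo
# exists and appears to describe a Fujitsu A64FX core; FALSE otherwise.
function(sketch_is_aarch64_a64fx return_var)
  set(is_a64fx FALSE)
  # macOS, the BSDs, and other non-Linux systems have no /proc/cpuinfo.
  # file(READ) on a missing file is a hard configure error, so check first.
  if(EXISTS "/proc/cpuinfo")
    file(READ "/proc/cpuinfo" cpuinfo_content)
    # Assumption: A64FX reports implementer 0x46 (Fujitsu) and part 0x001.
    string(REGEX MATCH "CPU implementer[ \t]*:[ \t]*0x46"
           has_implementer "${cpuinfo_content}")
    string(REGEX MATCH "CPU part[ \t]*:[ \t]*0x001"
           has_part "${cpuinfo_content}")
    # string(REGEX MATCH) stores an empty string when nothing matches,
    # which if() treats as false.
    if(has_implementer AND has_part)
      set(is_a64fx TRUE)
    endif()
  endif()
  set(${return_var} ${is_a64fx} PARENT_SCOPE)
endfunction()

# Usage: the call site can test the boolean directly, no MATCHES "1" needed.
sketch_is_aarch64_a64fx(SKETCH_IS_A64FX)
if(SKETCH_IS_A64FX)
  message(STATUS "A64FX detected: a 256-byte cache line would apply")
endif()
```

Defaulting to FALSE means any system without /proc/cpuinfo falls back to the existing cache line sizes and never fails at configure time.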