This patch includes the a subset of MMA intrinsics that are included in the mma intrinsic module:
mma_assemble_acc
mma_assemble_pair
mma_build_acc
mma_disassemble_acc
mma_disassemble_pair
Paths
| Differential D155725
[flang] Add a subset of PowerPC MMA (Matrix Multiply Accelerate) intrinsics ClosedPublic Authored by DanielCChen on Jul 19 2023, 9:43 AM.
Details Summary This patch includes the a subset of MMA intrinsics that are included in the mma intrinsic module: mma_assemble_acc
Diff Detail
Event TimelineComment Actions I bet that the mma ops are only available on newer ppc cores and not on the old Apple ones. I could not find a feature check. Comment Actions
Correct. The MMA intrinsics are for Power 10 only. It is documented in IBM's compiler Language Reference Comment Actions I wanted to say that I could try to use them on an Apple and the compiler would crash somewhere, see https://reviews.llvm.org/D152131. Comment Actions
Right. There is a pending work item that will add semantic checking of Power10 cpu for these MMA intrinsics. Because there are some vector intrinsics that are for Power 10 only as well. We deferred the work until we have all MMA and vector intrinsics in. Comment Actions Attempt to fix the x64 Window build failure that failed due to linker error as: FIRBuilder.lib(PPCIntrinsicCall.cpp.obj) : error LNK2019: unresolved external symbol "void __cdecl Fortran::common::die(char const *,...)" (?die@common@Fortran@@YAXPEBDZZ) referenced in function "class mlir::FunctionType __cdecl fir::getMmaIrFuncType(class mlir::MLIRContext *,enum fir::MMAOp)" (?getMmaIrFuncType@fir@@YA?AVFunctionType@mlir@@PEAVMLIRContext@3@W4MMAOp@1@@Z) bin\tco.exe : fatal error LNK1120: 1 unresolved externals Comment Actions Fixing a build failure using clang as the build compiler. lib/Optimizer/Builder/PPCIntrinsicCall.cpp:1100:3: error: default label in switch which covers all enumeration values [-Werror,-Wcovered-switch-default]
DanielCChen added inline comments.
DanielCChen marked an inline comment as done. Comment ActionsAddress review comments to remove unrelated changes. DanielCChen marked 2 inline comments as done. Comment ActionsIt seems the module and the test are missing from the last update. Put them back in after rebase. Comment Actions Could you please get the pwr10 check in before continuing to add more unchecked intrinsics. Comment Actions
Why is this important to you? This is a compiler under active development that is not yet suitable for production use. Comment Actions
The pwr10 check that covers all PowerPC intrinsics in this patch is in https://reviews.llvm.org/D156243. Comment Actions
The buildbots run the test suite from trunk and no buildbot caught the fact that there are issues trying to run the power10 intrinsics on a power9. There are missing tests and/or checks. Comment Actions
These Power10 intrinsic test in this patch and the other one (to be landed) has "-mcpu=pwr10" specified. It also only dumps the LLRM IR. In theory, they should not run on Power9. That being said, as I mentioned in the previous note, I put the semantic checking code in a a different patch, which should issue a compile time error message for unsupported cpu option. This revision is now accepted and ready to land.Jul 26 2023, 7:29 AM Comment Actions
Thanks. I will wait one more day in case there are more comments. Closed by commit rG2c2d427ca3a5: [flang] Add a subset of PowerPC MMA (Matrix Multiply Accelerate) intrinsics (authored by kkwli0). · Explain WhyJul 27 2023, 11:36 AM This revision was automatically updated to reflect the committed changes. Comment Actions Builds are failing to compile this code in PPCIntrinsicCall.cpp: const char *getMmaIrIntrName(MMAOp mmaOp) { switch (mmaOp) { case MMAOp::AssembleAcc: return "llvm.ppc.mma.assemble.acc"; case MMAOp::AssemblePair: return "llvm.ppc.vsx.assemble.pair"; case MMAOp::DisassembleAcc: return "llvm.ppc.mma.disassemble.acc"; case MMAOp::DisassemblePair: return "llvm.ppc.vsx.disassemble.pair"; } } mlir::FunctionType getMmaIrFuncType(mlir::MLIRContext *context, MMAOp mmaOp) { switch (mmaOp) { case MMAOp::AssembleAcc: return genMmaVqFuncType(context, /*Quad*/ 0, /*Pair*/ 0, /*Vector*/ 4); case MMAOp::AssemblePair: return genMmaVpFuncType(context, /*Quad*/ 0, /*Pair*/ 0, /*Vector*/ 2); case MMAOp::DisassembleAcc: return genMmaDisassembleFuncType(context, mmaOp); case MMAOp::DisassemblePair: return genMmaDisassembleFuncType(context, mmaOp); } } with the message flang/lib/Optimizer/Builder/PPCIntrinsicCall.cpp:1288:1: error: control reaches end of non-void function [-Werror=return-type] These two switches probably need an error default. Comment Actions
@vdonaldson Thanks for bringing it up! I am looking into it now.
Revision Contents
Diff 544872 flang/include/flang/Optimizer/Builder/PPCIntrinsicCall.h
flang/lib/Optimizer/Builder/IntrinsicCall.cpp
flang/lib/Optimizer/Builder/PPCIntrinsicCall.cpp
flang/lib/Optimizer/Dialect/FIROps.cpp
flang/lib/Semantics/check-call.cpp
flang/lib/Semantics/semantics.cpp
flang/module/mma.f90
flang/test/Lower/PowerPC/ppc-mma-assemble-disassemble.f90
flang/tools/f18/CMakeLists.txt
|
I don't think this is a relevant change for this patch.