The mfma (matrix fused multiply-add) instructions present on some
AMDGPUs provide hardware support for particular matrix multiplication
sizes and formats.
In LLVM, these operations are exposed via intrinsics. To make them more
ergonomic to use in MLIR, we define an amdgpu.mfma operation that
takes an MFMAInstr enum to specify which instruction should be used.
This allows higher-level code to select the mfma instruction by
changing an enum value instead of by selecting a different operation,
which simplifies generating matrix multiplication kernels.
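
The enum-driven selection described above can be illustrated outside MLIR. The following is only a sketch in Python: the `MfmaInstr` class, its member names, and the `pick_instr` and `describe_kernel` helpers are hypothetical stand-ins chosen for illustration, not the dialect's actual MFMAInstr enum or API. It shows how a kernel generator can stay unchanged while the instruction choice is just a value.

```python
from enum import Enum


class MfmaInstr(Enum):
    # Hypothetical variants for illustration only; each value is
    # (M, N, K, element type) of the tile the instruction computes.
    F32_32X32X2_F32 = (32, 32, 2, "f32")
    F32_16X16X4_F32 = (16, 16, 4, "f32")
    I32_16X16X16_I8 = (16, 16, 16, "i8")


def pick_instr(m, n, elem_type):
    """Return the first variant whose tile shape and input type match."""
    for instr in MfmaInstr:
        im, in_, _, ty = instr.value
        if (im, in_, ty) == (m, n, elem_type):
            return instr
    raise ValueError(f"no mfma variant for {m}x{n} tiles of {elem_type}")


def describe_kernel(instr):
    """Stand-in for a kernel generator: only the enum value changes."""
    m, n, k, ty = instr.value
    return f"mfma tile {m}x{n}x{k} on {ty}"
```

Under these assumptions, switching the generated kernel from one tile shape to another means passing a different enum value to the same generator, rather than calling a different builder for each instruction.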
For instructions that logically operate on vectors of bytes, the
amdgpu.mfma operation also accepts those vectors directly as inputs,
instead of requiring, as LLVM does, that the bytes be concatenated
into an i32 or i64.
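
To make the concatenation concrete, the sketch below packs four or eight byte values into a single integer, which is the form the text says LLVM expects. It is an illustration in Python, not code from the dialect: the `pack_bytes` helper is hypothetical, and placing element 0 in the least significant byte is an assumption (it matches what a bitcast does on a little-endian target).

```python
def pack_bytes(values):
    """Concatenate 4 or 8 byte values into one i32- or i64-sized integer.

    Assumption: element 0 occupies the least significant byte.
    """
    if len(values) not in (4, 8):
        raise ValueError("expected 4 bytes (i32) or 8 bytes (i64)")
    packed = 0
    for i, v in enumerate(values):
        if not -128 <= v <= 255:
            raise ValueError(f"value {v} does not fit in a byte")
        # Mask to 8 bits so signed bytes use their two's-complement pattern.
        packed |= (v & 0xFF) << (8 * i)
    return packed
```

For example, `pack_bytes([1, 2, 3, 4])` yields `0x04030201` under the stated byte order. Accepting the byte vector directly spares higher-level code from performing this step by hand.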