This patch upstreams support for BFloat16 matrix multiplication intrinsics
and code generation from __bf16 to AArch64, including the IR intrinsics.
Unit tests are provided as needed. AArch32 intrinsics and CodeGen will
follow in a later patch.
This patch is part of a series implementing the Bfloat16 extension of the
Armv8.6-a architecture, as detailed here:
The bfloat type and its properties are specified in the Arm Architecture
Reference Manual:
The following people contributed to this patch:
- Luke Geeson
- Momchil Velikov
- Mikhail Maltsev
- Luke Cheeseman
This chunk does not belong to the patch