Page MenuHomePhabricator

[BFloat] Add convert/copy instrinsic support
ClosedPublic

Authored by miyuki on Jun 1 2020, 8:05 AM.

Details

Summary

This patch is part of a series implementing the Bfloat16 extension of the Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

Specifically it adds intrinsic support in clang and llvm for Arm and AArch64.

The bfloat type, and its properties are specified in the Arm Architecture Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

The following people contributed to this patch:

  • Alexandros Lamprineas
  • Luke Cheeseman
  • Mikhail Maltsev
  • Momchil Velikov

Diff Detail

Event Timeline

labrinea created this revision.Jun 1 2020, 8:05 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJun 1 2020, 8:05 AM

A few nits, it looks largely uncontroversial but I'll let someone else give the ok since an external eye is always good :)

clang/utils/TableGen/NeonEmitter.cpp
1064

I always wondered why we need special treatment for cvt intrinsics here, is there a better place to put this?

llvm/test/CodeGen/ARM/bf16-intrinsics-nofp16.ll
1 ↗(On Diff #267620)

These files are very small, it's almost not worth having them on their own, can you merge them into 1 file unless we need 2 for a specific reason

llvm/test/CodeGen/ARM/bf16-intrinsics.ll
1 ↗(On Diff #267620)

same here

miyuki added inline comments.Jun 4 2020, 3:12 AM
llvm/lib/Target/ARM/ARMISelDAGToDAG.cpp
4765

Why does it return MVT::v4i16, not MVT::v4bf16?

ostannard added inline comments.
llvm/include/llvm/IR/IntrinsicsARM.td
803

I only see this being used for f32 -> bf16 conversion, so could this have concrete types, instead of llvm_anyvector_ty?

momchil.velikov resigned from this revision.Jun 8 2020, 7:15 AM
miyuki commandeered this revision.Wed, Jun 17, 7:48 AM
miyuki edited reviewers, added: labrinea; removed: miyuki.
miyuki updated this revision to Diff 272406.Mon, Jun 22, 6:30 AM
  1. Rebased and fixed failures
  2. Added a test for AArch64 codegen of lane copying intrinsics
  3. Addressed reviewers' comments
miyuki marked 5 inline comments as done.Mon, Jun 22, 6:32 AM
miyuki added inline comments.
clang/utils/TableGen/NeonEmitter.cpp
1064

If we were to refactor this, it should be done in a separate patch.

stuij accepted this revision.Tue, Jun 23, 4:23 AM

LGTM

This revision is now accepted and ready to land.Tue, Jun 23, 4:23 AM
This revision was automatically updated to reflect the committed changes.