This is an archive of the discontinued LLVM Phabricator instance.

[X86][BF16] Enable __bf16 for x86 targets.
Closed, Public

Authored by FreddyYe on Aug 1 2022, 10:46 PM.

Event Timeline

FreddyYe created this revision. Aug 1 2022, 10:46 PM
Herald added a project: Restricted Project. Aug 1 2022, 10:46 PM
FreddyYe requested review of this revision. Aug 1 2022, 10:46 PM
Herald added projects: Restricted Project, Restricted Project. Aug 1 2022, 10:46 PM
FreddyYe retitled this revision from "Enable __bf16 for x86 targets." to "[X86][BF16] Enable __bf16 for x86 targets." Aug 1 2022, 10:49 PM

Add to ReleaseNotes.rst as well.

How are you actually implementing __bf16 on these targets? There isn't even hardware support for conversions.

bf16 -> float is really just a bit shift. The other direction gets lowered to a libcall; compiler-rt has a conversion function with proper rounding. I added some support to make the backend promote all other arithmetic to float, but I think that's only enabled on x86 so far.
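
To make the two directions concrete, here is a minimal sketch in plain C. It assumes bf16 is stored as the upper 16 bits of an IEEE-754 binary32 value; the names `bf16_bits`, `bf16_to_float` and `float_to_bf16` are hypothetical, and the narrowing routine only illustrates round-to-nearest-even. It is not the compiler-rt function mentioned above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is IEEE-754 binary32, so bf16 is its top 16 bits. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

typedef uint16_t bf16_bits; /* hypothetical name for the raw 16-bit storage */

/* Widening is exact: move the 16 bits into the high half of a 32-bit word. */
static float bf16_to_float(bf16_bits h) {
    uint32_t u = (uint32_t)h << 16;
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Narrowing needs rounding: round to nearest, ties to even. */
static bf16_bits float_to_bf16(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    if ((u & 0x7fffffffu) > 0x7f800000u) {
        /* NaN: keep the sign and payload top bits, force the quiet bit. */
        return (bf16_bits)((u >> 16) | 0x0040u);
    }
    /* Add just under half an ulp, plus one more if the kept LSB is odd. */
    u += 0x7fffu + ((u >> 16) & 1u);
    return (bf16_bits)(u >> 16);
}
```

Widening never loses information, which is why promoting bf16 arithmetic to float is cheap; only the narrowing step has to decide how to round.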

We support float -> bf16 in AVX512BF16. https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#avx512techs=AVX512_BF16
We also ran into problems with how to represent bf16 types in intrinsics. For example, __bfloat16 is currently defined as unsigned short, so we cannot stop a user from, e.g., adding two __bfloat16 values in C code and getting the wrong result. That is why we want to introduce the type on X86. For more information, please see the discussion in D120395.
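
A small sketch of the problem, assuming the raw storage is an unsigned short as described above. The names `bf16_raw`, `widen`, `narrow_trunc`, `add_as_integers` and `add_as_bf16` are hypothetical, and the narrowing helper simply truncates to keep the example short.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

typedef unsigned short bf16_raw; /* hypothetical stand-in for an integer typedef */

/* Reinterpret the 16 bits as the high half of a binary32 value. */
static float widen(bf16_raw h) {
    uint32_t u = (uint32_t)h << 16;
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Illustration only: drop the low 16 bits without rounding. */
static bf16_raw narrow_trunc(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return (bf16_raw)(u >> 16);
}

/* What `a + b` means when the type is an integer typedef:
   the bit patterns are added, not the values they encode. */
static bf16_raw add_as_integers(bf16_raw a, bf16_raw b) {
    return (bf16_raw)(a + b);
}

/* What the user most likely meant: add the encoded values. */
static bf16_raw add_as_bf16(bf16_raw a, bf16_raw b) {
    return narrow_trunc(widen(a) + widen(b));
}
```

With 1.0 encoded as 0x3f80 and 2.0 as 0x4000, `add_as_integers` yields 0x7f80, which is the bf16 encoding of +infinity, while `add_as_bf16` yields 0x4040, the encoding of 3.0. The compiler accepts both silently when the type is just an integer typedef, which is the motivation for a distinct __bf16 type.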

Yes, the x86 backend can be viewed as already dealing with __bf16, and with https://reviews.llvm.org/D130832 it will completely follow the psABI. As for hardware support, x86 has actually supported bf16 since AVX512BF16 (https://reviews.llvm.org/D60552), which has vector conversion support between float and bf16. However, at that time we chose a typedef of short as the C type. In the future, we can support backend lowering for these instructions: VCVTNE2PS2BF16, VCVTNEPS2BF16 and VDPBF16PS.

Right, but this patch is adding x86 support whenever SSE2 is available. AVX512BF16 is available on a *very* small slice of processors. In contrast, e.g. F16C is relatively broadly available, although I understand that we formally support _Float16 all the way back to SSE2 and thus on some processors that lack F16C.

But okay, pure intrinsic support is fine if that's what we're doing.

I think the patch looks fine.

Yes. This type is for pure intrinsic support. Thanks for your review. Let's wait for the backend patch to land first.

pengfei accepted this revision. Aug 4 2022, 7:37 PM

LGTM.

This revision is now accepted and ready to land. Aug 4 2022, 7:37 PM
This revision was landed with ongoing or failed builds. Aug 9 2022, 6:41 PM
This revision was automatically updated to reflect the committed changes.