This is an archive of the discontinued LLVM Phabricator instance.

Enable fma and f16c features when avx512f is enabled.
AbandonedPublic

Authored by ab on Jun 22 2015, 2:59 PM.

Details

Reviewers
craig.topper
Summary

Since, according to the ISA extension manual, (EVEX versions of) the FMA3/F16C instructions are a core part of AVX512F, it seems to me we should enable the features when avx512f is enabled (e.g., -mavx512f). If not, users need to either specify a specific CPU, or add the redundant -mfma/-mf16c.

Diff Detail

Event Timeline

ab updated this revision to Diff 28164.Jun 22 2015, 2:59 PM
ab retitled this revision from to Enable fma and f16c features when avx512f is enabled..
ab updated this object.
ab edited the test plan for this revision. (Show Details)
ab added a reviewer: craig.topper.
ab added a subscriber: Unknown Object (MLST).

Also, we should probably do the same for LLVM; if that sounds good I'll submit another patch.

Thanks!

craig.topper edited edge metadata.Jun 22 2015, 11:56 PM

Are the 512-bit intrinsics in clang's headers qualified with F16C and FMA
defines? They don't seem to be in gcc's headers. And gcc doesn't seem to
auto enable FMA or F16C when avx512 is specified.

Similar question for the instructions in llvm.

ab added a comment.Jun 23 2015, 1:23 PM

Are the 512-bit intrinsics in clang's headers qualified with F16C and FMA
defines? They don't seem to be in gcc's headers. And gcc doesn't seem to
auto enable FMA or F16C when avx512 is specified.

Similar question for the instructions in llvm.

They're not?

But, if that's what you're getting at: if the defines have a 1-to-1 mapping with the intrinsics/headers (I think they do?), then on second thought, this isn't the right fix: +avx512f says nothing about whether we support FMA/F16C intrinsics, only the AVX512F ones.

Then the actual issue here is that, for fma and f16c, LLVM codegen relies on +fma/+f16c only to mark say f32 ISD::FMA as legal, and generate X86ISD::FMADD. Both of which are necessary to generate even the AVX512F instruction, and this makes it impossible with +avx512f alone.
Does that make sense? If so I'll post a patch!

-Ahmed

I think that makes sense. We need to enable ISD::FMA for the smaller types
when AVX512 is enabled. Probably need to add AVX512 in PerformFMACombine
and isFMAFasterThanFMulAndFAdd as well.

ab abandoned this revision.Jun 24 2015, 5:52 PM

Yep, simple enough: r240616, let me know if I missed anything.

The 213->231 xform, I'll leave for another day, since we can't
currently disassemble the 231/132 variants: PR23937.
Also, F16C is a wee bit trickier, and not just about flags: PR23941.

-Ahmed