This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Fix armv8 features tree and add fp16fml
AbandonedPublic

Authored by dnsampaio on Apr 17 2019, 8:51 AM.

Details

Summary

This patch adds the fp16fml feature parser as well fixes
the FPU and the HW_FP flags when +fullfp16 and +dotprod
features are passed, to account for pre-requisite features.

The ARM backend (ARM.td) defines this tree of feature dependencies:

 fp16  vfp3
   |  /   |
  vfp4    neon
   |          \
fp-armv8       FeatureDotProd
   |
fullfp16
   |
fp16fml

However, clang does not capture that when using +fullfp16 we
also have vfp4, so compiling
tools/clang/test/CodeGen/arm_neon_intrinsics.c

with

clang -target armv8a-linux-eabi -march=armv8.4-a+fp16 -S -emit-llvm

will give an error because vfp4 is not added to the FPU flag.

As test now we can compile that test file with the command

clang -target armv8a-linux-eabi -march=armv8-a+fp16fml -S -emit-llvm

Diff Detail

Event Timeline

dnsampaio created this revision.Apr 17 2019, 8:51 AM
ostannard added inline comments.
lib/Basic/Targets/ARM.cpp
443

Is it always correct to set HW_FP_DP here, now that MVE can have full fp16 without double-precision? I'll add Simon since he's working on that.

444

Should this be FPARMV8, since fullfp16 doesn't apply to earlier architectures? Maybe MVE complicates this even further?

446

Should this also add FPARMV8?

452

Again, should this be FPARMV8?

453

You are adding HW_FP_HP twice.

test/CodeGen/arm_neon_intrinsics.c
8

Does the generate code differ enough to have a second copy of it? Actually, I assume the problem here is that we are not setting the correct preprocessor macros? in which case, it would be better to test them directly, than by re-running this 21kloc test.

dnsampaio marked 5 inline comments as done.Apr 18 2019, 1:58 AM
dnsampaio added inline comments.
lib/Basic/Targets/ARM.cpp
443

If it is V8 and does not have double-precision, isn't that going to use the argument +fp-only-sp ? That will disable WH_FP_DP using lines 436 and 456.

else if (Feature == "+fp-only-sp") {
HW_FP_remove |= HW_FP_DP;
HW_FP &= ~HW_FP_remove;
444

Correct, it should also add FeatureFPARMv8.
See that this is a accumulative flag, as FPARMv8 implies VFP4FPU, both should be set.
Actually, according the backend, FeatureFPARMv8 -> FeatureVFP4 -> FeatureVFP3 -> FeatureVFP2. From my tests we already set FeatureVFP3, but not FeatureVFP4

446

As well, yes.

452

True.

test/CodeGen/arm_neon_intrinsics.c
8

That indeed seems a better solution.

dnsampaio planned changes to this revision.Apr 18 2019, 2:28 AM

Waiting for the outcome of D60691.

dnsampaio abandoned this revision.Jun 5 2019, 9:13 AM

Fixed.

test/CodeGen/arm_neon_intrinsics.c