This is an archive of the discontinued LLVM Phabricator instance.

PR27216: Only define __ARM_FEATURE_FMA when the target has VFPv4
ClosedPublic

Authored by sbaranga on Apr 11 2016, 6:54 AM.

Details

Summary

According to the ACLE spec, "__ARM_FEATURE_FMA is defined to 1 if
the hardware floating-point architecture supports fused floating-point
multiply-accumulate".

This changes clang's behaviour from emitting this macro for v7-A and v7-R
cores to only emitting it when the target has VFPv4 (and therefore support
for the floating point multiply-accumulate instruction).

Fixes PR27216

Diff Detail

Event Timeline

sbaranga updated this revision to Diff 53232.Apr 11 2016, 6:54 AM
sbaranga retitled this revision from to PR27216: Only define __ARM_FEATURE_FMA when the target has VFPv4.
sbaranga updated this object.
sbaranga added a reviewer: t.p.northover.
sbaranga added a subscriber: cfe-commits.
rengolin added inline comments.
lib/Basic/Targets.cpp
4935

I think just two checks are necessary, here:

(FPU & VFPV4FPU) || (ArchVersion > 7)

and make sure that the right FPU flag is set from the right cores, plus "+vfp4".

test/CodeGen/arm-neon-fma.c
6

why not change the cpu to a core that has vfp4?

I know the test is about FMA, not the CPU, but this is a combination that will never occur in the wild...

test/Sema/arm_vfma.c
1

It's possible that v7 Apple cores always have FMA? I'd make sure of that before forcing the flag here. We don't want to disable it inadvertently.

@t.p.northover, can you confirm Apple's support for VFP4?

sbaranga added inline comments.Apr 11 2016, 7:26 AM
lib/Basic/Targets.cpp
4935

Yes, that should be sufficient.

test/CodeGen/arm-neon-fma.c
6

Sure, good point.

test/Sema/arm_vfma.c
1

If they do support it and don't have the vfp4 feature, then before this patch clang/llvm wouldn't have emitted a fma/vfma instruction anyway in any circumstances (because the backend will not generate it). The backend would instead legalize it with fmaf() libcalls - but that's not the correct behaviour according to the spec.

sbaranga updated this revision to Diff 53254.Apr 11 2016, 9:02 AM

Apply review comments from Renato:

  • simplify condition for enabling __ARM_FEATURE_FMA
  • use cortex-a7 instead of cortex-a8 for testing since this is a real use case.
rengolin accepted this revision.Apr 11 2016, 9:30 AM
rengolin edited edge metadata.

LGTM. Thanks!

This revision is now accepted and ready to land.Apr 11 2016, 9:30 AM
t.p.northover added inline comments.Apr 11 2016, 10:49 AM
test/Sema/arm_vfma.c
1

v7s is Swift, which has FMA. v7 for us is Cortex-A9, which I think also has FMA (not that it matters much these days).

rengolin added inline comments.Apr 11 2016, 12:18 PM
test/Sema/arm_vfma.c
1

v7 is Cortex-A8, and neither A8 nor A9 have FMA in VFP, only NEON.

Does Swift have FMA in VFP? or just NEON?

t.p.northover added inline comments.Apr 11 2016, 12:25 PM
test/Sema/arm_vfma.c
1

Sorry, it appears virtually every part of my statement was wrong then. v7 really does seem to be Cortex-A8 even for us, and Swift doesn't have scalar VFMA.

sbaranga added inline comments.Apr 12 2016, 8:01 AM
test/Sema/arm_vfma.c
1

The error seems to be coming from how the getDefaultFPU() is called when the cpu is not specified. It turns out that it gets called with an empty CPU string (perhaps we meant to call with either "generic" or the CPU that was set in ARMTargetInfo (which does get correctly recognized as swift in this case).

FWIW, Cortex-A9 doesn't have FMA,

sbaranga updated this revision to Diff 53409.Apr 12 2016, 8:44 AM
sbaranga edited edge metadata.

If no cpu has been passed to the command line, use the generic cpu when selecting
features/FPU, instead of using an empty string (which is not recognized by the
TargetParser).

I've updated the patch to fix the defaults when the cpu is not specified. Renato, Tim, could you have a look at this again please?

Thanks,
Silviu

A gentle ping?

Cheers,
Silviu

rengolin added inline comments.Apr 26 2016, 6:33 AM
lib/Basic/Targets.cpp
4710

This change is unrelated and may bring side effects into clang. I'd keep this out and investigate it in another patch with the appropriate tests. If you just force the target-feature in the test, this corner case won't be relevant in this patch.

test/Sema/arm_vfma.c
1

This test should not be relying on the front-end getting the feature right, it should be forcing the vfpv4 target feature on a base arch like "arm-none-eabi".

test/Sema/neon-vector-types-support.c
1 ↗(On Diff #53409)

This change seems unrelated?

sbaranga added inline comments.Apr 26 2016, 6:39 AM
lib/Basic/Targets.cpp
4710

Ok, that makes sense. I'll revert to the previous revision of this patch.

sbaranga updated this revision to Diff 55018.Apr 26 2016, 8:58 AM

Address the latest review comments (which means rolling back to the last change).

sbaranga added inline comments.Apr 26 2016, 9:00 AM
test/Sema/arm_vfma.c
1

I updated this test, but used thumbv7-none-eabi here, since VFPv4 requires at least v7.

LGTM! Thanks!

sbaranga closed this revision.Apr 28 2016, 4:34 AM

Thanks, r267869!

-Silviu