Page MenuHomePhabricator

[ARM] Fix bugs introduced by the fp64/d32 rework.
ClosedPublic

Authored by simon_tatham on Jun 7 2019, 1:14 AM.

Details

Summary

Change D60691 caused some knock-on failures that weren't caught by the
existing tests. Firstly, selecting a CPU that should have had a
restricted FPU (e.g. -mcpu=cortex-m4, which should have 16 d-regs
and no double precision) could give the unrestricted version, because
ARM::getFPUFeatures returned a list of features including subtracted
ones (here -fp64,-d32), but ARMTargetInfo::initFeatureMap threw
away all the ones that didn't start with +. Secondly, the
preprocessor macros didn't reliably match the actual compilation
settings: for example, -mfpu=softvfp could still set __ARM_FP as
if hardware FP was available, because the list of features on the cc1
command line would include things like +vfp4,-vfp4d16 and clang
didn't realise that one of those cancelled out the other.

I've fixed both of these issues by rewriting ARM::getFPUFeatures so
that it returns a list that enables every FP-related feature
compatible with the selected FPU and disables every feature not
compatible, which is more verbose but means clang doesn't have to
understand the dependency relationships between the backend features.
Meanwhile, ARMTargetInfo::handleTargetFeatures is testing for all
the various forms of the FP feature names, so that it won't miss cases
where it should have set HW_FP to feed into feature test macros.

That in turn caused an ordering problem when handling -mcpu=foo+bar
together with -mfpu=something_that_turns_off_bar. To fix that, I've
arranged that the +bar suffixes on the end of -mcpu and -march
cause feature names to be put into a separate vector which is
concatenated after the output of getFPUFeatures.

Another side effect of all this is to fix a bug where `clang -target
armv8-eabi` by itself would fail to set __ARM_FEATURE_FMA, even
though armv8 (aka Arm v8-A) implies FP-Armv8 which has FMA. That was
because HW_FP was being set to a value including only the FPARMV8
bit, but that feature test macro was testing only the VFP4FPU bit.
Now HW_FP ends up with all the bits set, so it gives the right
answer.

Changes to tests included in this patch:

  • arm-target-features.c: I had to change basically all the expected results. (The Cortex-M4 test in there should function as a regression test for the accidental double-precision bug.)
  • arm-mfpu.c, armv8.1m.main.c: switched to using CHECK-DAG everywhere so that those tests are no longer sensitive to the order of cc1 feature options on the command line.
  • arm-acle-6.5.c: been updated to expect the right answer to that FMA test.
  • Preprocessor/arm-target-features.c: added a regression test for the mfpu=softvfp issue.

Diff Detail

Repository
rL LLVM

Event Timeline

simon_tatham created this revision.Jun 7 2019, 1:14 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJun 7 2019, 1:14 AM
ostannard accepted this revision.Jun 7 2019, 1:55 AM

LGTM with one change.

clang/lib/Driver/ToolChains/Arch/ARM.cpp
294 ↗(On Diff #203506)

You explained it in the commit message, but I think this could do with a comment in the code too.

This revision is now accepted and ready to land.Jun 7 2019, 1:55 AM
SjoerdMeijer added inline comments.Jun 7 2019, 2:01 AM
clang/lib/Driver/ToolChains/Arch/ARM.cpp
103 ↗(On Diff #203506)

Nitpick: can you comment in code somewhere what the purpose is of FeaturesAfter? Can it be renamed to something more descriptive? I don't know what it could be to be honest...

449 ↗(On Diff #203506)

This function and this code is truly one of the worst I guess.... I mean, it's not worse than it was before, but you'd really hope that there was a (targetparser) API that gives you a list of FPUs, possibly matching some criteria, rather than yet another list of hard coded strings. It gets worse, this is repeated in lib/Support/ARMTargetParser.cpp. End of rant! ;-)

More on topic: do we need to add +mve.fp here?

llvm/lib/Support/ARMTargetParser.cpp
178 ↗(On Diff #203506)

This looks familiar.... I don't think we can easily simplify this? Or get rid of? I guess that's best done in a follow up when we this working again? Perhaps the only insignificant simplification is that we don't need the +<string> and -<string> because they are the same.

simon_tatham marked 2 inline comments as done.Jun 7 2019, 2:11 AM
simon_tatham added inline comments.
clang/lib/Driver/ToolChains/Arch/ARM.cpp
449 ↗(On Diff #203506)

It did occur to me that this code could be made a lot simpler if clang had direct access to the data about the dependency relations between subtarget features. Then it could replace all of this with a simpler and more robust set of tests for things like "does the transitive closure of the feature set include vfp4d16sp?" instead of having to list every individual feature that might appear. But currently that dependency web lives in the ARM target's tablegen, so it isn't guaranteed to be compiled in to the LLVM that clang is linking against.

llvm/lib/Support/ARMTargetParser.cpp
178 ↗(On Diff #203506)

As I say in the comment a few lines up, we need both versions of the string because this function is returning a vector of StringRef with no mechanism to free anything it allocates, so it has to return pointers to static string literals.

(I could do some preprocessor fiddling to expand a single feature name into both strings, but I tend to assume most people have a less strong stomach for cpp horrors than I do :-)

Renamed FeaturesAfter to ExtensionFeatures and added a comment
explaining why it has to be separate.

simon_tatham marked 2 inline comments as done.Jun 7 2019, 3:28 AM

Cheers, LGTM too.

This revision was automatically updated to reflect the committed changes.