This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Replace fp-only-sp and d16 with fp64 and d32.
ClosedPublic

Authored by simon_tatham on Apr 15 2019, 5:56 AM.

Details

Summary

Those two subtarget features were awkward because their semantics are
reversed: each one indicates the _lack_ of support for something in
the architecture, rather than the presence. As a consequence, you
don't get the behavior you want if you combine two sets of feature
bits.

Each SubtargetFeature for an FP architecture version now comes in four
versions, one for each combination of those options. So you can still
say (for example) '+vfp2' in a feature string and it will mean what
it's always meant, but there's a new string '+vfp2d16sp' meaning the
version without those extra options.

A lot of this change is just mechanically replacing positive checks
for the old features with negative checks for the new ones. But one
more interesting change is that I've rearranged getFPUFeatures() so
that the main FPU feature is appended to the output list *before*
rather than after the features derived from the Restriction field, so
that -fp64 and -d32 can override defaults added by the main feature.

Diff Detail

Repository
rC Clang

Event Timeline

simon_tatham created this revision.Apr 15 2019, 5:56 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptApr 15 2019, 5:56 AM

I like the direction of this change, and the details look correct too.

The one thing I wonder about is whether we should be upgrading .bc files too (or otherwise support fp-only-sp in legacy inputs). I think it's a specialized enough feature that there won't be much older IR being mixed in, but can't be sure there's none. Anyone else got any views?

ostannard edited reviewers, added: ostannard; removed: olista01.Apr 17 2019, 9:19 AM
ostannard added a subscriber: ostannard.

For the auto-upgrader, these limited FPUs only exist for microcontrollers, where I doubt any users are keeping bitcode files around for a long time, so I'd be fine with not doing this. I've had a skim through the auto-upgrader code, and I don't see any other target-specific attributes which are handled there, though there are some target-specific intrinsics.

I don't think we here care about auto-updating, not supporting bitcode/lto libraries.

llvm/lib/Target/ARM/ARMFastISel.cpp
1362 ↗(On Diff #195149)

Some of the formatting here got a little messed up.

ostannard added inline comments.May 3 2019, 6:08 AM
clang/test/Driver/arm-mfpu.c
112 ↗(On Diff #195149)

"+d32" ?

llvm/lib/Target/ARM/ARMSubtarget.h
587 ↗(On Diff #195149)

Are the old functions still used anywhere? If they are not used (much), I think it would be better to just have one set of functions for the base FPU version, and check hasFP64 and hasD32 where needed, to avoid the rick of using the wrong version of these functions.

llvm/test/MC/ARM/armv8.3a-js.s
16 ↗(On Diff #195149)

Do you know why this diagnostic got worse?

simon_tatham marked an inline comment as done.May 28 2019, 3:16 AM
simon_tatham added inline comments.
llvm/test/MC/ARM/armv8.3a-js.s
16 ↗(On Diff #195149)

I do now :-) It's because that instruction is invalid for two reasons: firstly, vjcvt requires FP-ARMv8 which the command line has turned off, and secondly, using d18 requires the new d32 feature which the command line has also turned off. So llc -debug reports "Opcode result: multiple types of mismatch, so not reporting near-miss".

I think this should fix all the outstanding review comments on this patch.

simon_tatham marked 4 inline comments as done.May 28 2019, 6:59 AM
simon_tatham added inline comments.
llvm/lib/Target/ARM/ARMSubtarget.h
587 ↗(On Diff #195149)

Indeed, it turned out the old functions weren't used anywhere at all.

SjoerdMeijer accepted this revision.May 28 2019, 7:07 AM

There already was consensus that this is was a good change, and also that we don't care about auto-upgrading. With the last minor comments addressed, this looks good I think. Perhaps you can wait with committing until tomorrow morning just in case there are more comments.

This revision is now accepted and ready to land.May 28 2019, 7:07 AM
This revision was automatically updated to reflect the committed changes.

Hi Simon et. al., I'm working on a downstream ARM toolchain and have downstreamed this change into our codebase.
We saw that you've fixed the -mfpu=none issue and have taken that as well, but are still running into some issues.

Prior to your change, the optionset "-mcpu=cortex-m4 -mfloat-abi=hard" created the following target features in the LLVM IR coming out of clang:
"target-features"="+armv7e-m,+d16,+dsp,+fp-only-sp,+hwdiv,+thumb-mode,+vfp4,-crc,-dotprod,-fp16fml,-hwdiv-arm,-ras"

In specific, note the +d16, +fp-only-sp.

After your changes, we're not seeing the associated -d32, -fp64 target features:
"target-cpu"="cortex-m4" "target-features"="+armv7e-m,+dsp,+hwdiv,+thumb-mode,+vfp4,-crc,-dotprod,-fp16fml,-hwdiv-arm,-ras"

As a result, we are getting linktime failures between our libraries, which use "-mcpu=cortex-m4 -mfloat-abi=hard", and our tests, which specifically call out the FPU version being used.
An interesting note is that the target machine table has the following:

def : ProcessorModel<"cortex-m4", CortexM4Model,        [ARMv7em,
                                                         FeatureVFP4_D16_SP,
                                                         FeaturePrefLoopAlign32,
                                                         FeatureHasSlowFPVMLx,
                                                         FeatureUseMISched,
                                                         FeatureUseAA,
                                                         FeatureHasNoBranchPredictor]>;

Would this not mean that we'd expect vfp4-d16-sp when not otherwise specified? If not, then what is the expected behavior of -mfloat-abi=hard in the absence of an -mfpu specification?

Thanks,
J.B. Nagurne
Texas Instruments
Code Generation

simon_tatham marked an inline comment as done.Jun 4 2019, 3:01 AM

Hmm, yes, I see what you mean. It looks to me as if the problem is that llvm::ARM::getFPUFeatures is setting up a feature list including +vfp4, -fp64 and -d32 just as you'd expect, but when it's called from clang::targets::ARMTargetInfo::initFeatureMap in particular, that silently throws away any feature name in the list that doesn't start with +.

I suppose that means getFPUFeatures ought to be more careful, and add the more restricted feature vfp4d16sp in the first place instead of trying to do it by adding full vfp4 and then taking pieces away.