This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Allow "-march=foo+fp" to vary with foo.
ClosedPublic

Authored by SjoerdMeijer on Apr 15 2019, 5:57 AM.

Details

Summary

Now, when clang processes an argument of the form "-march=foo+x+y+z",
then instead of calling getArchExtFeature() for each of the extension
names "x", "y", "z" and appending the returned string to its list of
low-level subtarget features, it will call appendArchExtFeatures()
which does the appending itself.

The difference is that appendArchExtFeatures can add _more_ than one
low-level feature name to the output feature list if it has to, and
also, it gets told some information about what base architecture and
CPU the extension is going to go with, which means that "+fp" can now
mean something different for different CPUs. Namely, "+fp" now selects
whatever the _default_ FPU is for the selected CPU and/or
architecture, as defined in the ARM_ARCH or ARM_CPU_NAME macros in
ARMTargetParser.def.

On the clang side, I adjust DecodeARMFeatures to call the new
appendArchExtFeatures function in place of getArchExtFeature. This
means DecodeARMFeatures needs to be passed a CPU name and an ArchKind,
which meant changing its call sites to make those available, and also
sawing getLLVMArchSuffixForARM in half so that you can get an ArchKind
enum value out of it instead of a string.

Also, I add support here for the extension name "+fp.dp", which will
automatically look through the FPU list for something that looks just
like the default FPU except for also supporting double precision.

Diff Detail

Event Timeline

simon_tatham created this revision.Apr 15 2019, 5:57 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptApr 15 2019, 5:57 AM

This needs some tests. I'm also not quite sure when you'd use bare "+fp", if it's the default anyway.

llvm/lib/Support/ARMTargetParser.cpp
487–490

I think isNegated is probably more in line with existing naming.

515–518

Doesn't this mean +nofp.dp disables all floats? That seems surprising behaviour to me, but I can't find any existing tests covering it.

527–528

Doesn't this silently disable the FPU entirely if we decide "fp.dp" is useless? That seems unlikely to be what a user wants, especially without a diagnostic.

Could you also expand on the comment a bit more. I had to look up exactly what FPURestrictions existed to get this, and I'm not even 100% sure I'm right now.

simon_tatham marked 3 inline comments as done.Apr 16 2019, 2:53 AM

The aim of this change is that it will apply to the v8.1-M (mainline) architecture introduced in D60698, in which +fp won't be the default: -march=armv8.1m.main by itself gives you the base 8.1-M architecture without any FP, -march=armv8.1m.main+fp gives you the optional single-precision FP extension on top of that, and +fp.dp gives you double precision as well.

llvm/lib/Support/ARMTargetParser.cpp
487–490

Hmmm. I thought that would be a confusing name because it hides the fact that the function strips off the no prefix. (The use of 'was' was intended to hint that by the time the function returns, it's not true any more!)

Perhaps stripNegationPrefix (returning bool to indicate success)?

515–518

Hmmm, that's a good point. What would a user expect in that situation? If double-precision FP was the default for that architecture and a single-precision version existed, then perhaps nofp.dp should fall back to that, but what if it's double or nothing?

527–528

I don't think it silently disables it, does it? Returning false from this function is a failure indication that ends up back in checkARMArchName in clang/lib/Driver/ToolChains/Arch/ARM.cpp, which will generate a diagnostic. For example, if I try -march=armv6m+fp.dp then I see

error: the clang compiler does not support '-march=armv6m+fp.dp'
t.p.northover added inline comments.Apr 16 2019, 3:15 AM
llvm/lib/Support/ARMTargetParser.cpp
487–490

Ah yes, I see. I think your alternative is probably better.

515–518

I think I'd go for a diagnostic in that case. There's already a way to strip out the FPU then (+nofp).

527–528

Ah, I missed the only way return true could happen and assumed the return value was vestigial. Sorry.

SjoerdMeijer commandeered this revision.May 31 2019, 1:10 AM
SjoerdMeijer edited reviewers, added: simon_tatham; removed: SjoerdMeijer.

This addresses @t.p.northover comment.

ostannard requested changes to this revision.May 31 2019, 6:32 AM
ostannard added a subscriber: ostannard.

This still needs tests adding.

This revision now requires changes to proceed.May 31 2019, 6:32 AM
SjoerdMeijer marked an inline comment as done.May 31 2019, 6:46 AM

Ah yes, the school boy error! ;-) Actually, there was a test, but in a different patch; I will move it to here.

This time with tests.

ostannard added inline comments.Jun 3 2019, 3:03 AM
llvm/lib/Support/ARMTargetParser.cpp
551

Could you also add tests for the error cases here? I think these are:

  • +fp.dp, but the FPU is already double-precision
  • +fp.dp, but no double-precision FPU exists (are there any FPUs which cause this?)
  • +[no]fp or +[no]fp.dp for a CPU/arch which doesn't have any FPUs.

I also don't see any tests for the negated forms of either feature.

Hi Oliver, thanks for your comments!

This was the easy one, they have been added:

I also don't see any tests for the negated forms of either feature.

The trouble begun with this:

+fp.dp, but the FPU is already double-precision
+fp.dp, but no double-precision FPU exists (are there any FPUs which cause this?)
+[no]fp or +[no]fp.dp for a CPU/arch which doesn't have any FPUs.

Because I found that basically none of this worked. The main reason was that we were always passing generic. To address that we at least have a chance of seeing a sensible CPU name, I have swapped the order of parsing -march and -mcpu. I.e., we parse -mcpu first, and pass that to checkARMArchName, which will eventually call appendArchExtFeatures. I think that makes more sense when we use the CPUname to query getDefaultFPU.

Then about the more fancy diagnostics (e.g. "fp.dp, but the FPU is already double-precision"): I've removed any attempt to throw clever diagnostics. I don't think, in general, that we provide this kind of service level. I.e., we need to do a lot more work here to avoid a meaningless, confusing, and thus useless "--march=... not supported" error message when we provide +fp.dp on the -march when e.g. the CPU already enabled this.

ostannard accepted this revision.Jun 5 2019, 3:53 AM

Fair enough, I don't think we currently try to diagnose any other invalid combinations of features. LGTM.

This revision is now accepted and ready to land.Jun 5 2019, 3:53 AM
This revision was automatically updated to reflect the committed changes.