Page MenuHomePhabricator

[clang][Driver][ARM] Favor -mfpu over default CPU features

Authored by labrinea on Jun 28 2019, 9:13 AM.



When processing the command line options march, mcpu and mfpu, we store the implied target features on a vector. The change D62998 introduced a temporary vector, where the processed features get accumulated. When calling DecodeARMFeaturesFromCPU, which sets the default features for the specified CPU, we certainly don't want to override the features that have been explicitly specified on the command line. Therefore, the default features should appear first in the final vector. This problem became evident once I added the missing (unhandled) target features in ARM::getExtensionFeatures.

Diff Detail


Event Timeline

labrinea created this revision.Jun 28 2019, 9:13 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJun 28 2019, 9:13 AM

The second change this patch makes

Could this be spilt into two patches?

I will let Oliver finish the review (I am off for a few days), just some drive-by comments.

412 ↗(On Diff #207071)

This could be a little local helper function, share the code, as exactly the same is done in ARM::appendArchExtFeatures

574 ↗(On Diff #207071)

I like this approach.

580 ↗(On Diff #207071)

but the fact that we have these still here, I guess that means they are not present in the table. Can we add them too? I guess that's why you've added fp.dp.

585 ↗(On Diff #207071)

Oops, thanks for fixing! :-)

The second change this patch makes

Could this be spilt into two patches?

Looking at D62998 more carefully I realized that we deliberately favor cpu extensions over -mfpu:

That in turn caused an ordering problem when handling -mcpu=foo+bar
together with -mfpu=something_that_turns_off_bar. To fix that, I've
arranged that the +bar suffixes on the end of -mcpu and -march
cause feature names to be put into a separate vector which is
concatenated after the output of getFPUFeatures.

I am now in doubt about my changes in clang/lib/Driver/ToolChains/Arch/ARM.cpp. Imagine this case:
-mcpu=cortex-a73 -mfpu=crypto-neon-fp-armv8
According to the table in ARMTargetParser, cortex-a73 doesn't have crypto, therefore the -crypto feature gets in the vector, but then we explicitly ask for it through the mfpu option. What is supposed to win here? FYI this a test case from clang/test/Driver/arm-cortex-cpus.c. An obvious workaround is to add the crypto extension for cortex-a73 (and any other entry which is missing it) in the table.

Maybe @simon_tatham could shed some light here?

412 ↗(On Diff #207071)

We are not doing exactly the same thing in these functions. Here we extract features out of a bitmap, which is a map containing a bitwise OR of separate feature bitmasks. If a bitmask that corresponds to a known feature is present - and here I mean all the bits of that mask are present - then we push the feature, otherwise not.

In ARM::appendArchExtFeatures we compare a given bitmask, which corresponds to a specific feature, against all the known bitmasks (individual features) one by one. In contrast to ARM::getExtensionFeatures here we also handle negative features, and that means the prohibition of a feature can disable other features that depend on it as well.

580 ↗(On Diff #207071)

Unfortunately we can't, meaning that the table is supposed to contain feature names that are valid command line options for mcpu, march and those are clearly not. Or at least, that's my understanding of it.

I can't shed as much light as you might hope, I'm afraid, but in D62998 my intention was not to make -mcpu=anything win over -mfpu=anything. It was to make an explicit request to enable a feature win over an implicit request to disable it. It so happened in my example that the explicit request was in the -mcpu option.

In the case you describe here, -mcpu=cortex-a73 -mfpu=crypto-neon-fp-armv8, the explicitness goes the other way round. Cortex-A73 might be listed in the CPU table without crypto, but that's not because it can't have crypto; it's because the cryptographic extension is an optional feature of the Cortex-A73, so it's just that it need not have crypto. So -mcpu=cortex-a73 doesn't mean "definitely no crypto", it means "I haven't said whether I want crypto yet". And then the FPU specification asks for crypto (and in this case, crypto is even mentioned in the text of the FPU name). So here, it's the FPU specification that is (semi-)explicitly turning the feature on. So if a human user had written that command line, I would expect that then they surely did want crypto, and would be surprised not to get it.

But that's the kind of messy human reasoning you'd like to avoid implementing in software. I don't have a simple, consistent, general rule that gives the right answer to all these fiddly cases.

@simon_tatham, thanks for clarifying. I think my change is doing the right thing then: favors the -mfpu option over the default CPU features. I will split the patch as @ostannard suggested.

labrinea updated this revision to Diff 207436.Jul 1 2019, 4:24 PM
labrinea retitled this revision from [ARM] Minor fixes in command line option parsing to [clang][Driver][ARM] Favor -mfpu over default CPU features.
labrinea edited the summary of this revision. (Show Details)
labrinea removed subscribers: cfe-commits, llvm-commits.

I've split the patch.

ostannard added inline comments.Jul 3 2019, 2:05 AM
412 ↗(On Diff #207071)

Does this correctly handle the combination of integer-only MVE with a scalar FPU? I think -march=...+mve+fp should enable AEK_FP and AEK_SIMD (+mve), but shouldn't enable +mve.fp.

574 ↗(On Diff #207071)

I'm not sure this is a good idea - we are now testing the implementation by checking that it matches the table used by the implementation, so if there's a bug in the table this will still pass.

labrinea added inline comments.Jul 3 2019, 5:54 AM
412 ↗(On Diff #207071)

The target features explicitly specified on the command line are handled by ARM::appendArchExtFeatures (see D64048). That said, yes, -march=...+mve+fp does enable scalar floating point and integer-only mve, but doesn't enable mve with floating point. I could possibly add such a test on that revision.

On the other hand, ARM::getExtensionFeatures cannot tell the difference between the two configurations you describe, and it's not possible to do so, because they are represented on a bitmap returned from ARM::getDefaultExtensions, which reads the table. That said, if there was an entry in the table containing AEK_FP | AEK_SIMD that would enable mve.fp. However, I don't think this is a problem in practice. My understanding is that the table indicates FPU support with FK_*, and Extension support with AEK_*. That said, I believe AEK_FP is only used for command line option handling.

Maybe a fix for this problem would be to replace AEK_FP | AEK_SIMD with something like AEK_MVE_FP but then we wouldn't be able to do what is proposed in D64048.

574 ↗(On Diff #207071)

Surely, but is the purpose of this test to check that the table is correct, or that ARM::getExtensionFeatures reads the table correctly? I'd say the latter. Also, with this change we won't need to update the test every time there's a new entry in the table.

ostannard added inline comments.Jul 4 2019, 1:55 AM
412 ↗(On Diff #207071)

Is this system (in particular the behaviour added in D64048) going to be able to handle all of the other dependencies between architectural features? For example, MVE also depends on the DSP extension, but -march=armv8.1-m.main+mve+nodsp currently defines both ARM_FEATURE_MVE and ARM_FEATURE_DSP.

labrinea added inline comments.Jul 4 2019, 3:04 AM
412 ↗(On Diff #207071)

No, -march=armv8.1-m.main+mve+nodsp doesn't turn off neither mve nor dsp and it looks like a bug if they depend on each other. It seems you are right, the code in ARMTargetInfo::handleTargetFeatures enables both when mve is set:

} else if (Feature == "+mve") {
  DSP = 1;
} else if (Feature == "+mve.fp") {
  DSP = 1;
  HasLegalHalfType = true;

If there's a dependency then it should be present in the table of target parser. Then the above command would turn both off. I'll update the table and add some tests in D64048.

labrinea marked 2 inline comments as done.Jul 4 2019, 3:33 AM
ostannard accepted this revision.Jul 8 2019, 5:37 AM
This revision is now accepted and ready to land.Jul 8 2019, 5:37 AM
This revision was automatically updated to reflect the committed changes.