This is an archive of the discontinued LLVM Phabricator instance.

API for retrieving default FPU target features from TargetParser
ClosedPublic

Authored by labrinea on Jul 10 2015, 9:05 AM.

Details

Summary

Implemented as a table lookup

Diff Detail

Event Timeline

labrinea updated this revision to Diff 29447.Jul 10 2015, 9:05 AM
labrinea retitled this revision from to Use TargetParser to retrieve default FPU target features.
labrinea updated this object.
labrinea set the repository for this revision to rL LLVM.
labrinea added inline comments.Jul 10 2015, 9:16 AM
lib/Basic/Targets.cpp
4302 ↗(On Diff #29447)

the above should be cached in ARMTargetInfo, as class variables. TODO: http://reviews.llvm.org/D10839

4376 ↗(On Diff #29447)

This in not used yet. It will be used in an upcoming patch for ACLE 2.0 feature macros, which I am currently working on.

test/CodeGen/arm-long-calls.c
6 ↗(On Diff #29447)

This test has been recently moved from test/Driver. I had to modify these since more target features may be present in the "target-features" string (ex. +vfp*)

rengolin edited edge metadata.Jul 10 2015, 9:31 AM

Hi Alexandros,

Please don't mix Clang and LLVM patches on the same review. They will have to be committed separately and the order does matter.

Getting the default FPU from a CPU is something that needs to stay in ARMTargetParser, and required by the Clang part, so that one needs to go first, as a separate patch. It also needs tests for every combination.

cheers,
--renato

include/llvm/Support/TargetParser.h
181

A better name for this would be

getDefaultFPU(StringRef CPU)
lib/Basic/Targets.cpp
4302 ↗(On Diff #29447)

I'd hope you could do that before this patch, as working on something that will be removed is a bit pointless.

4314 ↗(On Diff #29447)

Unaligned has nothing to do with FPU, please move it to another patch.

4325 ↗(On Diff #29447)

We have CPU IDs in ARMTargetParser for that purpose, please do not parse strings anywhere else.

4331 ↗(On Diff #29447)

Again, use CPU IDs.

4376 ↗(On Diff #29447)

Then, please don't include it here. Save it for your ACLE patch.

4401 ↗(On Diff #29447)

Again, unaligned is a different patch.

lib/Support/TargetParser.cpp
265

This most definitely must be implemented as a table lookup. That's the whole point of ARMTargetParser is to move from a C++ implementation to a table-gen-like one, so we can move it to table-gen once we finish.

Please, convert this to a table lookup.

302

If you had this in a table you wouldn't need a comment. :)

test/CodeGen/arm-long-calls.c
6 ↗(On Diff #29447)

Looks like this could be a small patch on its own, with some evidence to support it separately.

labrinea added inline comments.Jul 13 2015, 2:22 AM
lib/Basic/Targets.cpp
4325 ↗(On Diff #29447)

I can only see Arch IDs in ARMTargetParser, and not CPU IDs.

rengolin added inline comments.Jul 13 2015, 2:27 AM
lib/Basic/Targets.cpp
4325 ↗(On Diff #29447)

Damn, you're right. I remember now that I *wanted* to implement that, but it doesn't make much sense. What we need is to map whatever is needed here based on CPU names and add a column to the CPU table. But not on this patch.

Please, just add a FIXME to use ARMTargetParser.

Could you be more specific when saying that we need to test every combination? Moreover, how can we isolate the tests from clang, since the Driver generates the target features?

Could you be more specific when saying that we need to test every combination? Moreover, how can we isolate the tests from clang, since the Driver generates the target features?

I believe the tests you added were additional to what was there already, testing that behaviour, so that'd be fine. If the tests in Clang test the functionality of TargetParser, it's ok.

Shouldn't the function call to getDefaultFPU(), which will be implemented as a table lookup, be tested in this patch? I am currently doing this in test/CodeGen/arm-target-features.c but it is in clang repository.

Shouldn't the function call to getDefaultFPU(), which will be implemented as a table lookup, be tested in this patch? I am currently doing this in test/CodeGen/arm-target-features.c but it is in clang repository.

In LLVM, the unit test (unittest/ADT/TripleTest.cpp) uses Triple to get the arch/cpu but not the fpu. Since the Triple class is going away, it's temporarily acceptable to test in Clang where the method is being used, if you can cover the expected behaviour.

Later on, we'll use TargetTuple and that will return FPU and all other target-specific architecture features, so we'll be able to do a full unit test on everything.

The way to go is to implement in LLVM and Clang, test in Clang, split the patch and mention that is being tested in Clang, maybe adding a FIXME to get a unit test for that, if you want.

cheers,
--renato

javed.absar added inline comments.Jul 13 2015, 8:58 AM
lib/Support/TargetParser.cpp
302

Why does VFPV4 etc appear below NEON column?
Instead of VFPV3_D16/Xd I think the proper nomenclature is VFPV3XD (please see FPUNames table in same file).

A qs (probably for Renato), is there a way to ensure consistency between what is declared here (which fpu) and what is stated in ARM.td ?

rengolin added inline comments.Jul 13 2015, 9:08 AM
lib/Support/TargetParser.cpp
302

Hi Javed,

There is currently no way to ensure consistency between this file and any table generated file. The ultimate goal of the target parsers is to have *all* tables generated from the *.td files, but for that to happen, we need to create a new TableGen back-end that is fast and produce those tables for *all* targets even if no back-end is built in, due to Clang needing them regardless.

Once the tables are stable, we can start moving them to tablegen and add a general rule to CMake to always build them.

Alexandros,

Feel free to add the missing CPUs with the respective FPU/NEON supports to the main table.

cheers,
--renato

CPUs are available in various versions. For example cortex-a5 can have (a) ARM::FK_NONE, (b) ARM::FK_VFPV4_D16, or (c) ARM::FK_NEON_VFPV4. What should be the default ? Maybe the most powerful?

ps. (Out of the scope of this patch): Clang specifies that using the -mfpu flag, and not the cpu name (there is no "cortex-a5.nofpu", "cortex-a5.vfp", "cortex-a5.neon"). Does clang driver check whether a combination cpu+fpu is valid? I think not. A way to assist clang to do that is to keep a table with all valid combinations.

Moreover, adding new cpu names in the table without adding them in ARM.td is not very useful.

CPUs are available in various versions. For example cortex-a5 can have (a) ARM::FK_NONE, (b) ARM::FK_VFPV4_D16, or (c) ARM::FK_NEON_VFPV4. What should be the default ? Maybe the most powerful?

The default should be what is most common, or most likely not to break things, not the most powerful. That's the whole point of having defaults and ways to change the options.

For instance, choosing VFPV4 may break things when your chip is VFPV4_D16. But if the availability of the latter is non-existent, then we should probably choose VFPV4 and let people specifically add -mfpu for their odd cores. The alternative is to generate inefficient code for the rest of the world.

That's why the default for ARMv7A is VFP3+NEON, and not VFPV3_D16 like GCC, or VFPV4+NEON, as would be the most powerful. There are still a lot of A8/A9 out there, and they rarely lack NEON (Tegra3 being the odd one out).

ps. (Out of the scope of this patch): Clang specifies that using the -mfpu flag, and not the cpu name (there is no "cortex-a5.nofpu", "cortex-a5.vfp", "cortex-a5.neon"). Does clang driver check whether a combination cpu+fpu is valid? I think not. A way to assist clang to do that is to keep a table with all valid combinations.

"-mcpu=cortex-a5+neon" is indeed valid, and keeping a table with *all* combinations is not a good use of space/time. The way we validate that now is to validate the first token as CPU and everything else as "options", either FPU, HDIV, etc. I'm not sure *everything* works, but it should.

Moreover, adding new cpu names in the table without adding them in ARM.td is not very useful.

As I said to Javed, this will go in TableGen, but not yet. First, we need to know *what* we need, and how we need, so that we can create a new TableGen backend to generate it.

The reason why we need a new backend is because *all* back-end info will run on *every* build, even if no back-ends are built with it. This way, Clang and other external tools can still use the same parser, even if they don't build all back-ends.

So this back-end has to be small, simple and *really* fast. All TableGen files being generated on all builds is not an option.

labrinea updated this revision to Diff 29668.Jul 14 2015, 5:28 AM
labrinea edited edge metadata.
labrinea removed rL LLVM as the repository for this revision.
  • Default FPU implemented as a table lookup
  • Added cortex-r5f and cortex-m4f

Thanks Alexandros, this looks a lot better. Some comments...

lib/Support/TargetParser.cpp
141

Maybe DefaultFPU as a name?

195

Why FP16 for A9?

labrinea added inline comments.Jul 14 2015, 5:48 AM
lib/Support/TargetParser.cpp
141

fair enough

195

cortex-a9
Page 10, Paragraph 1.1 third dot: "The Cortex-A9 NEON MPE features are: [...] SIMD and scalar half-precision floating-point conversion"
Page 11, Paragraph 1.1.1, line 3: "It provides conversions between 16-bit, 32-bit and 64-bit floating-point formats [...]"

Be careful - you are linking to the internal infocentre rather than the public one.

Always link to http://infocenter.arm.com/help/index.jsp for external reference.

rengolin accepted this revision.Jul 14 2015, 5:55 AM
rengolin edited edge metadata.

With the variable name change, LGTM. Thanks!

lib/Support/TargetParser.cpp
195

fair enough.

This revision is now accepted and ready to land.Jul 14 2015, 5:55 AM
labrinea added inline comments.Jul 14 2015, 5:55 AM
lib/Support/TargetParser.cpp
195

cortex-a9
Page 10, Paragraph 1.1 third dot: "The Cortex-A9 NEON MPE features are: [...] SIMD and scalar half-precision floating-point conversion"
Page 11, Paragraph 1.1.1, line 3: "It provides conversions between 16-bit, 32-bit and 64-bit floating-point formats [...]"

labrinea retitled this revision from Use TargetParser to retrieve default FPU target features to API for retrieving default FPU target features from TargetParser.Jul 14 2015, 6:29 AM
labrinea updated this object.
labrinea edited edge metadata.
javed.absar added inline comments.Jul 14 2015, 6:40 AM
lib/Support/TargetParser.cpp
142

perhaps a comment here to clarify 'default' applies to what (fpu?) ?

202

Are you sure about cortex-r5 having no default fpu.
Please see test/CodeGen/ARM/build-attributes.ll
where specifying no fpu seems to generate :
CORTEX-R5: .fpu vfpv3-d16

rengolin added inline comments.Jul 14 2015, 6:45 AM
lib/Support/TargetParser.cpp
142

That's a good point. It made sense when there was only one field. :)

202

The ARM website says it's optional. It depends on how common is the version with and the one without VFP.

But ultimately, yes, we want to keep the same logic everywhere. One of the main goals of this file. :)

Alexandros,

Feel free to submit a new patch with the changes proposed by Javed.

cheers,
--renato

jmolloy closed this revision.Jul 17 2015, 2:23 AM
jmolloy edited edge metadata.

[Phab spring clean] This got accepted.

labrinea added inline comments.Jul 17 2015, 5:24 AM
lib/Support/TargetParser.cpp
208

Having seen test/CodeGen/ARM/build-attributes.ll I can tell that cortex-m4 has VFPv4_SP_D16. That's the reason cortex-m4f is absent. Shall I commit a new patch with this update and remove the cortex-m4f entry?

rengolin added inline comments.Jul 17 2015, 5:43 AM
lib/Support/TargetParser.cpp
208

If you're not using, yes, please. We need to get all ideas about the targets consistent.

I wonder why is build-attributes outputting FPV4 to m4. It has to be getting this default from somewhere else, and it should be the job of this method to provide that info.

Can you investigate why it is so? I believe whatever method getting the default should be moved to use this table instead. You'll probably find many inconsistencies, that will need to be fixed. :)

javed.absar added inline comments.Jul 17 2015, 6:25 AM
lib/Support/TargetParser.cpp
208

I think it is outputting the fpu name from TargetParser (the FPUnAMES[ ] table):
OS << "\t.fpu\t" << ARMTargetParser::getFPUName(FPU) << "\n"

and it is reading the build attributes values from include/llvm/Support/ARMBuildAttributes.h:
"AllowFPv4B = 6, // v4 FP ISA was permitted,...."
which it indexes using e.g ARM::FK_FPV4_SP_D16 => setAttributeItem(ARMBuildAttrs::FP_arch, ARMBuildAttrs::AllowFPv4B

In other words, I think you just need to check to ensure that ARMBuildAttributes.h match with TRM and FPUNames[ ] table.
But I could be still missing something..