This is an archive of the discontinued LLVM Phabricator instance.

API for retrieving default FPU target features from TargetParser
ClosedPublic

Authored by labrinea on Jul 10 2015, 9:05 AM.

Download Raw Diff

Details

Reviewers

rengolin
jmolloy
richard.barton.arm

Summary

Implemented as a table lookup

Diff Detail

Event Timeline

labrinea updated this revision to Diff 29447.Jul 10 2015, 9:05 AM

labrinea retitled this revision from to Use TargetParser to retrieve default FPU target features.

labrinea updated this object.

labrinea added reviewers: rengolin, richard.barton.arm, jmolloy.

labrinea set the repository for this revision to rL LLVM.

Herald added a subscriber: aemerson. · View Herald TranscriptJul 10 2015, 9:05 AM

labrinea added inline comments.Jul 10 2015, 9:16 AM

lib/Basic/Targets.cpp
4302 ↗	(On Diff #29447)	the above should be cached in ARMTargetInfo, as class variables. TODO: http://reviews.llvm.org/D10839
4376 ↗	(On Diff #29447)	This in not used yet. It will be used in an upcoming patch for ACLE 2.0 feature macros, which I am currently working on.
test/CodeGen/arm-long-calls.c
6 ↗	(On Diff #29447)	This test has been recently moved from test/Driver. I had to modify these since more target features may be present in the "target-features" string (ex. +vfp*)

Hi Alexandros,

Please don't mix Clang and LLVM patches on the same review. They will have to be committed separately and the order does matter.

Getting the default FPU from a CPU is something that needs to stay in ARMTargetParser, and required by the Clang part, so that one needs to go first, as a separate patch. It also needs tests for every combination.

cheers,
--renato

include/llvm/Support/TargetParser.h
181	A better name for this would be getDefaultFPU(StringRef CPU)
lib/Basic/Targets.cpp
4302 ↗	(On Diff #29447)	I'd hope you could do that before this patch, as working on something that will be removed is a bit pointless.
4314 ↗	(On Diff #29447)	Unaligned has nothing to do with FPU, please move it to another patch.
4325 ↗	(On Diff #29447)	We have CPU IDs in ARMTargetParser for that purpose, please do not parse strings anywhere else.
4331 ↗	(On Diff #29447)	Again, use CPU IDs.
4376 ↗	(On Diff #29447)	Then, please don't include it here. Save it for your ACLE patch.
4401 ↗	(On Diff #29447)	Again, unaligned is a different patch.
lib/Support/TargetParser.cpp
265	This most definitely must be implemented as a table lookup. That's the whole point of ARMTargetParser is to move from a C++ implementation to a table-gen-like one, so we can move it to table-gen once we finish. Please, convert this to a table lookup.
302	If you had this in a table you wouldn't need a comment. :)
test/CodeGen/arm-long-calls.c
6 ↗	(On Diff #29447)	Looks like this could be a small patch on its own, with some evidence to support it separately.

labrinea added inline comments.Jul 13 2015, 2:22 AM

lib/Basic/Targets.cpp
4325 ↗	(On Diff #29447)	I can only see Arch IDs in ARMTargetParser, and not CPU IDs.

rengolin added inline comments.Jul 13 2015, 2:27 AM

lib/Basic/Targets.cpp
4325 ↗	(On Diff #29447)	Damn, you're right. I remember now that I wanted to implement that, but it doesn't make much sense. What we need is to map whatever is needed here based on CPU names and add a column to the CPU table. But not on this patch. Please, just add a FIXME to use ARMTargetParser.

Could you be more specific when saying that we need to test every combination? Moreover, how can we isolate the tests from clang, since the Driver generates the target features?

In D11100#203524, @labrinea wrote:

Could you be more specific when saying that we need to test every combination? Moreover, how can we isolate the tests from clang, since the Driver generates the target features?

I believe the tests you added were additional to what was there already, testing that behaviour, so that'd be fine. If the tests in Clang test the functionality of TargetParser, it's ok.

Shouldn't the function call to getDefaultFPU(), which will be implemented as a table lookup, be tested in this patch? I am currently doing this in test/CodeGen/arm-target-features.c but it is in clang repository.

javed.absar added a subscriber: javed.absar.Jul 13 2015, 3:54 AM

In D11100#203577, @labrinea wrote:

Shouldn't the function call to getDefaultFPU(), which will be implemented as a table lookup, be tested in this patch? I am currently doing this in test/CodeGen/arm-target-features.c but it is in clang repository.

In LLVM, the unit test (unittest/ADT/TripleTest.cpp) uses Triple to get the arch/cpu but not the fpu. Since the Triple class is going away, it's temporarily acceptable to test in Clang where the method is being used, if you can cover the expected behaviour.

Later on, we'll use TargetTuple and that will return FPU and all other target-specific architecture features, so we'll be able to do a full unit test on everything.

The way to go is to implement in LLVM and Clang, test in Clang, split the patch and mention that is being tested in Clang, maybe adding a FIXME to get a unit test for that, if you want.

cheers,
--renato

javed.absar added inline comments.Jul 13 2015, 8:58 AM

lib/Support/TargetParser.cpp
302	Why does VFPV4 etc appear below NEON column? Instead of VFPV3_D16/Xd I think the proper nomenclature is VFPV3XD (please see FPUNames table in same file). A qs (probably for Renato), is there a way to ensure consistency between what is declared here (which fpu) and what is stated in ARM.td ?

rengolin added inline comments.Jul 13 2015, 9:08 AM

lib/Support/TargetParser.cpp
302	Hi Javed, There is currently no way to ensure consistency between this file and any table generated file. The ultimate goal of the target parsers is to have all tables generated from the .td files, but for that to happen, we need to create a new TableGen back-end that is fast and produce those tables for all* targets even if no back-end is built in, due to Clang needing them regardless. Once the tables are stable, we can start moving them to tablegen and add a general rule to CMake to always build them. Alexandros, Feel free to add the missing CPUs with the respective FPU/NEON supports to the main table. cheers, --renato

CPUs are available in various versions. For example cortex-a5 can have (a) ARM::FK_NONE, (b) ARM::FK_VFPV4_D16, or (c) ARM::FK_NEON_VFPV4. What should be the default ? Maybe the most powerful?

ps. (Out of the scope of this patch): Clang specifies that using the -mfpu flag, and not the cpu name (there is no "cortex-a5.nofpu", "cortex-a5.vfp", "cortex-a5.neon"). Does clang driver check whether a combination cpu+fpu is valid? I think not. A way to assist clang to do that is to keep a table with all valid combinations.

Moreover, adding new cpu names in the table without adding them in ARM.td is not very useful.

In D11100#204443, @labrinea wrote:

CPUs are available in various versions. For example cortex-a5 can have (a) ARM::FK_NONE, (b) ARM::FK_VFPV4_D16, or (c) ARM::FK_NEON_VFPV4. What should be the default ? Maybe the most powerful?

The default should be what is most common, or most likely not to break things, not the most powerful. That's the whole point of having defaults and ways to change the options.

For instance, choosing VFPV4 may break things when your chip is VFPV4_D16. But if the availability of the latter is non-existent, then we should probably choose VFPV4 and let people specifically add -mfpu for their odd cores. The alternative is to generate inefficient code for the rest of the world.

That's why the default for ARMv7A is VFP3+NEON, and not VFPV3_D16 like GCC, or VFPV4+NEON, as would be the most powerful. There are still a lot of A8/A9 out there, and they rarely lack NEON (Tegra3 being the odd one out).

ps. (Out of the scope of this patch): Clang specifies that using the -mfpu flag, and not the cpu name (there is no "cortex-a5.nofpu", "cortex-a5.vfp", "cortex-a5.neon"). Does clang driver check whether a combination cpu+fpu is valid? I think not. A way to assist clang to do that is to keep a table with all valid combinations.

"-mcpu=cortex-a5+neon" is indeed valid, and keeping a table with *all* combinations is not a good use of space/time. The way we validate that now is to validate the first token as CPU and everything else as "options", either FPU, HDIV, etc. I'm not sure *everything* works, but it should.

Moreover, adding new cpu names in the table without adding them in ARM.td is not very useful.

As I said to Javed, this will go in TableGen, but not yet. First, we need to know *what* we need, and how we need, so that we can create a new TableGen backend to generate it.

The reason why we need a new backend is because *all* back-end info will run on *every* build, even if no back-ends are built with it. This way, Clang and other external tools can still use the same parser, even if they don't build all back-ends.

So this back-end has to be small, simple and *really* fast. All TableGen files being generated on all builds is not an option.

Default FPU implemented as a table lookup
Added cortex-r5f and cortex-m4f

Thanks Alexandros, this looks a lot better. Some comments...

lib/Support/TargetParser.cpp
141	Maybe DefaultFPU as a name?
195	Why FP16 for A9?

labrinea added inline comments.Jul 14 2015, 5:48 AM

lib/Support/TargetParser.cpp
141	fair enough
195	cortex-a9 Page 10, Paragraph 1.1 third dot: "The Cortex-A9 NEON MPE features are: [...] SIMD and scalar half-precision floating-point conversion" Page 11, Paragraph 1.1.1, line 3: "It provides conversions between 16-bit, 32-bit and 64-bit floating-point formats [...]"

Be careful - you are linking to the internal infocentre rather than the public one.

Always link to http://infocenter.arm.com/help/index.jsp for external reference.

With the variable name change, LGTM. Thanks!

lib/Support/TargetParser.cpp
195	fair enough.

This revision is now accepted and ready to land.Jul 14 2015, 5:55 AM

labrinea added inline comments.Jul 14 2015, 5:55 AM

lib/Support/TargetParser.cpp
195	cortex-a9 Page 10, Paragraph 1.1 third dot: "The Cortex-A9 NEON MPE features are: [...] SIMD and scalar half-precision floating-point conversion" Page 11, Paragraph 1.1.1, line 3: "It provides conversions between 16-bit, 32-bit and 64-bit floating-point formats [...]"

labrinea retitled this revision from Use TargetParser to retrieve default FPU target features to API for retrieving default FPU target features from TargetParser.Jul 14 2015, 6:29 AM

labrinea updated this object.

labrinea edited edge metadata.

javed.absar added inline comments.Jul 14 2015, 6:40 AM

lib/Support/TargetParser.cpp
142	perhaps a comment here to clarify 'default' applies to what (fpu?) ?
202	Are you sure about cortex-r5 having no default fpu. Please see test/CodeGen/ARM/build-attributes.ll where specifying no fpu seems to generate : CORTEX-R5: .fpu vfpv3-d16

rengolin added inline comments.Jul 14 2015, 6:45 AM

lib/Support/TargetParser.cpp
142	That's a good point. It made sense when there was only one field. :)
202	The ARM website says it's optional. It depends on how common is the version with and the one without VFP. But ultimately, yes, we want to keep the same logic everywhere. One of the main goals of this file. :)

Alexandros,

Feel free to submit a new patch with the changes proposed by Javed.

cheers,
--renato

[Phab spring clean] This got accepted.

labrinea added inline comments.Jul 17 2015, 5:24 AM

lib/Support/TargetParser.cpp
208	Having seen test/CodeGen/ARM/build-attributes.ll I can tell that cortex-m4 has VFPv4_SP_D16. That's the reason cortex-m4f is absent. Shall I commit a new patch with this update and remove the cortex-m4f entry?

rengolin added inline comments.Jul 17 2015, 5:43 AM

lib/Support/TargetParser.cpp
208	If you're not using, yes, please. We need to get all ideas about the targets consistent. I wonder why is build-attributes outputting FPV4 to m4. It has to be getting this default from somewhere else, and it should be the job of this method to provide that info. Can you investigate why it is so? I believe whatever method getting the default should be moved to use this table instead. You'll probably find many inconsistencies, that will need to be fixed. :)

javed.absar added inline comments.Jul 17 2015, 6:25 AM

lib/Support/TargetParser.cpp
208	I think it is outputting the fpu name from TargetParser (the FPUnAMES[ ] table): OS << "\t.fpu\t" << ARMTargetParser::getFPUName(FPU) << "\n" and it is reading the build attributes values from include/llvm/Support/ARMBuildAttributes.h: "AllowFPv4B = 6, // v4 FP ISA was permitted,...." which it indexes using e.g ARM::FK_FPV4_SP_D16 => setAttributeItem(ARMBuildAttrs::FP_arch, ARMBuildAttrs::AllowFPv4B In other words, I think you just need to check to ensure that ARMBuildAttributes.h match with TRM and FPUNames[ ] table. But I could be still missing something..

Revision Contents

Path

Size

include/

llvm/

Support/

TargetParser.h

1 line

lib/

Support/

TargetParser.cpp

179 lines

Diff 29668

include/llvm/Support/TargetParser.h

	Show First 20 Lines • Show All 172 Lines • ▼ Show 20 Lines
	public:			public:
	static StringRef getCanonicalArchName(StringRef Arch);			static StringRef getCanonicalArchName(StringRef Arch);

	// Information by ID			// Information by ID
	static const char * getFPUName(unsigned FPUKind);			static const char * getFPUName(unsigned FPUKind);
	static unsigned getFPUVersion(unsigned FPUKind);			static unsigned getFPUVersion(unsigned FPUKind);
	static unsigned getFPUNeonSupportLevel(unsigned FPUKind);			static unsigned getFPUNeonSupportLevel(unsigned FPUKind);
	static unsigned getFPURestriction(unsigned FPUKind);			static unsigned getFPURestriction(unsigned FPUKind);
				static unsigned getDefaultFPU(StringRef CPU);
				rengolinUnsubmitted Not Done Reply Inline Actions A better name for this would be getDefaultFPU(StringRef CPU) rengolin: A better name for this would be getDefaultFPU(StringRef CPU)
	// FIXME: This should be moved to TargetTuple once it exists			// FIXME: This should be moved to TargetTuple once it exists
	static bool getFPUFeatures(unsigned FPUKind,			static bool getFPUFeatures(unsigned FPUKind,
	std::vector<const char*> &Features);			std::vector<const char*> &Features);
	static const char * getArchName(unsigned ArchKind);			static const char * getArchName(unsigned ArchKind);
	static unsigned getArchAttr(unsigned ArchKind);			static unsigned getArchAttr(unsigned ArchKind);
	static const char * getCPUAttr(unsigned ArchKind);			static const char * getCPUAttr(unsigned ArchKind);
	static const char * getSubArch(unsigned ArchKind);			static const char * getSubArch(unsigned ArchKind);
	static const char * getArchExtName(unsigned ArchExtKind);			static const char * getArchExtName(unsigned ArchExtKind);
	Show All 17 Lines

lib/Support/TargetParser.cpp

Show First 20 Lines • Show All 132 Lines • ▼ Show 20 Lines
// List of CPU names and their arches.		// List of CPU names and their arches.
// The same CPU can have multiple arches and can be default on multiple arches.		// The same CPU can have multiple arches and can be default on multiple arches.
// When finding the Arch for a CPU, first-found prevails. Sort them accordingly.		// When finding the Arch for a CPU, first-found prevails. Sort them accordingly.
// When this becomes table-generated, we'd probably need two tables.		// When this becomes table-generated, we'd probably need two tables.
// FIXME: TableGen this.		// FIXME: TableGen this.
struct {		struct {
const char *Name;		const char *Name;
ARM::ArchKind ArchID;		ARM::ArchKind ArchID;
		ARM::FPUKind FPUID; // Default FPU
		rengolinUnsubmitted Not Done Reply Inline Actions Maybe DefaultFPU as a name? rengolin: Maybe DefaultFPU as a name?
		labrineaAuthorUnsubmitted Not Done Reply Inline Actions fair enough labrinea: fair enough
bool Default;		bool Default;
		javed.absarUnsubmitted Not Done Reply Inline Actions perhaps a comment here to clarify 'default' applies to what (fpu?) ? javed.absar: perhaps a comment here to clarify 'default' applies to what (fpu?) ?
		rengolinUnsubmitted Not Done Reply Inline Actions That's a good point. It made sense when there was only one field. :) rengolin: That's a good point. It made sense when there was only one field. :)
} CPUNames[] = {		} CPUNames[] = {
{ "arm2", ARM::AK_ARMV2, true },		{ "arm2", ARM::AK_ARMV2, ARM::FK_NONE, true },
{ "arm3", ARM::AK_ARMV2A, true },		{ "arm3", ARM::AK_ARMV2A, ARM::FK_NONE, true },
{ "arm6", ARM::AK_ARMV3, true },		{ "arm6", ARM::AK_ARMV3, ARM::FK_NONE, true },
{ "arm7m", ARM::AK_ARMV3M, true },		{ "arm7m", ARM::AK_ARMV3M, ARM::FK_NONE, true },
{ "arm8", ARM::AK_ARMV4, false },		{ "arm8", ARM::AK_ARMV4, ARM::FK_NONE, false },
{ "arm810", ARM::AK_ARMV4, false },		{ "arm810", ARM::AK_ARMV4, ARM::FK_NONE, false },
{ "strongarm", ARM::AK_ARMV4, true },		{ "strongarm", ARM::AK_ARMV4, ARM::FK_NONE, true },
{ "strongarm110", ARM::AK_ARMV4, false },		{ "strongarm110", ARM::AK_ARMV4, ARM::FK_NONE, false },
{ "strongarm1100", ARM::AK_ARMV4, false },		{ "strongarm1100", ARM::AK_ARMV4, ARM::FK_NONE, false },
{ "strongarm1110", ARM::AK_ARMV4, false },		{ "strongarm1110", ARM::AK_ARMV4, ARM::FK_NONE, false },
{ "arm7tdmi", ARM::AK_ARMV4T, true },		{ "arm7tdmi", ARM::AK_ARMV4T, ARM::FK_NONE, true },
{ "arm7tdmi-s", ARM::AK_ARMV4T, false },		{ "arm7tdmi-s", ARM::AK_ARMV4T, ARM::FK_NONE, false },
{ "arm710t", ARM::AK_ARMV4T, false },		{ "arm710t", ARM::AK_ARMV4T, ARM::FK_NONE, false },
{ "arm720t", ARM::AK_ARMV4T, false },		{ "arm720t", ARM::AK_ARMV4T, ARM::FK_NONE, false },
{ "arm9", ARM::AK_ARMV4T, false },		{ "arm9", ARM::AK_ARMV4T, ARM::FK_NONE, false },
{ "arm9tdmi", ARM::AK_ARMV4T, false },		{ "arm9tdmi", ARM::AK_ARMV4T, ARM::FK_NONE, false },
{ "arm920", ARM::AK_ARMV4T, false },		{ "arm920", ARM::AK_ARMV4T, ARM::FK_NONE, false },
{ "arm920t", ARM::AK_ARMV4T, false },		{ "arm920t", ARM::AK_ARMV4T, ARM::FK_NONE, false },
{ "arm922t", ARM::AK_ARMV4T, false },		{ "arm922t", ARM::AK_ARMV4T, ARM::FK_NONE, false },
{ "arm9312", ARM::AK_ARMV4T, false },		{ "arm9312", ARM::AK_ARMV4T, ARM::FK_NONE, false },
{ "arm940t", ARM::AK_ARMV4T, false },		{ "arm940t", ARM::AK_ARMV4T, ARM::FK_NONE, false },
{ "ep9312", ARM::AK_ARMV4T, false },		{ "ep9312", ARM::AK_ARMV4T, ARM::FK_NONE, false },
{ "arm10tdmi", ARM::AK_ARMV5T, true },		{ "arm10tdmi", ARM::AK_ARMV5T, ARM::FK_NONE, true },
{ "arm1020t", ARM::AK_ARMV5T, false },		{ "arm1020t", ARM::AK_ARMV5T, ARM::FK_NONE, false },
{ "arm9e", ARM::AK_ARMV5TE, false },		{ "arm9e", ARM::AK_ARMV5TE, ARM::FK_NONE, false },
{ "arm946e-s", ARM::AK_ARMV5TE, false },		{ "arm946e-s", ARM::AK_ARMV5TE, ARM::FK_NONE, false },
{ "arm966e-s", ARM::AK_ARMV5TE, false },		{ "arm966e-s", ARM::AK_ARMV5TE, ARM::FK_NONE, false },
{ "arm968e-s", ARM::AK_ARMV5TE, false },		{ "arm968e-s", ARM::AK_ARMV5TE, ARM::FK_NONE, false },
{ "arm10e", ARM::AK_ARMV5TE, false },		{ "arm10e", ARM::AK_ARMV5TE, ARM::FK_NONE, false },
{ "arm1020e", ARM::AK_ARMV5TE, false },		{ "arm1020e", ARM::AK_ARMV5TE, ARM::FK_NONE, false },
{ "arm1022e", ARM::AK_ARMV5TE, true },		{ "arm1022e", ARM::AK_ARMV5TE, ARM::FK_NONE, true },
{ "iwmmxt", ARM::AK_ARMV5TE, false },		{ "iwmmxt", ARM::AK_ARMV5TE, ARM::FK_NONE, false },
{ "xscale", ARM::AK_ARMV5TE, false },		{ "xscale", ARM::AK_ARMV5TE, ARM::FK_NONE, false },
{ "arm926ej-s", ARM::AK_ARMV5TEJ, true },		{ "arm926ej-s", ARM::AK_ARMV5TEJ, ARM::FK_NONE, true },
{ "arm1136jf-s", ARM::AK_ARMV6, true },		{ "arm1136jf-s", ARM::AK_ARMV6, ARM::FK_VFPV2, true },
{ "arm1176j-s", ARM::AK_ARMV6K, false },		{ "arm1176j-s", ARM::AK_ARMV6K, ARM::FK_NONE, false },
{ "arm1176jz-s", ARM::AK_ARMV6K, false },		{ "arm1176jz-s", ARM::AK_ARMV6K, ARM::FK_NONE, false },
{ "mpcore", ARM::AK_ARMV6K, false },		{ "mpcore", ARM::AK_ARMV6K, ARM::FK_VFPV2, false },
{ "mpcorenovfp", ARM::AK_ARMV6K, false },		{ "mpcorenovfp", ARM::AK_ARMV6K, ARM::FK_NONE, false },
{ "arm1176jzf-s", ARM::AK_ARMV6K, true },		{ "arm1176jzf-s", ARM::AK_ARMV6K, ARM::FK_VFPV2, true },
{ "arm1176jzf-s", ARM::AK_ARMV6Z, true },		{ "arm1176jzf-s", ARM::AK_ARMV6Z, ARM::FK_VFPV2, true },
{ "arm1176jzf-s", ARM::AK_ARMV6ZK, true },		{ "arm1176jzf-s", ARM::AK_ARMV6ZK, ARM::FK_VFPV2, true },
{ "arm1156t2-s", ARM::AK_ARMV6T2, true },		{ "arm1156t2-s", ARM::AK_ARMV6T2, ARM::FK_NONE, true },
{ "arm1156t2f-s", ARM::AK_ARMV6T2, false },		{ "arm1156t2f-s", ARM::AK_ARMV6T2, ARM::FK_VFPV2, false },
{ "cortex-m0", ARM::AK_ARMV6M, true },		{ "cortex-m0", ARM::AK_ARMV6M, ARM::FK_NONE, true },
{ "cortex-m0plus", ARM::AK_ARMV6M, false },		{ "cortex-m0plus", ARM::AK_ARMV6M, ARM::FK_NONE, false },
{ "cortex-m1", ARM::AK_ARMV6M, false },		{ "cortex-m1", ARM::AK_ARMV6M, ARM::FK_NONE, false },
{ "sc000", ARM::AK_ARMV6M, false },		{ "sc000", ARM::AK_ARMV6M, ARM::FK_NONE, false },
{ "cortex-a5", ARM::AK_ARMV7A, false },		{ "cortex-a5", ARM::AK_ARMV7A, ARM::FK_NEON_VFPV4, false },
{ "cortex-a7", ARM::AK_ARMV7A, false },		{ "cortex-a7", ARM::AK_ARMV7A, ARM::FK_NEON_VFPV4, false },
{ "cortex-a8", ARM::AK_ARMV7A, true },		{ "cortex-a8", ARM::AK_ARMV7A, ARM::FK_NEON, true },
{ "cortex-a9", ARM::AK_ARMV7A, false },		{ "cortex-a9", ARM::AK_ARMV7A, ARM::FK_NEON_FP16, false },
		rengolinUnsubmitted Not Done Reply Inline Actions Why FP16 for A9? rengolin: Why FP16 for A9?
		labrineaAuthorUnsubmitted Not Done Reply Inline Actions cortex-a9 Page 10, Paragraph 1.1 third dot: "The Cortex-A9 NEON MPE features are: [...] SIMD and scalar half-precision floating-point conversion" Page 11, Paragraph 1.1.1, line 3: "It provides conversions between 16-bit, 32-bit and 64-bit floating-point formats [...]" labrinea: [[ http://arminfo.emea.arm.com/help/topic/com.arm.doc.
		rengolinUnsubmitted Not Done Reply Inline Actions fair enough. rengolin: fair enough.
		labrineaAuthorUnsubmitted Not Done Reply Inline Actions cortex-a9 Page 10, Paragraph 1.1 third dot: "The Cortex-A9 NEON MPE features are: [...] SIMD and scalar half-precision floating-point conversion" Page 11, Paragraph 1.1.1, line 3: "It provides conversions between 16-bit, 32-bit and 64-bit floating-point formats [...]" labrinea: [[ http://infocenter.arm.com/help/topic/com.arm.doc.
{ "cortex-a12", ARM::AK_ARMV7A, false },		{ "cortex-a12", ARM::AK_ARMV7A, ARM::FK_NEON_VFPV4, false },
{ "cortex-a15", ARM::AK_ARMV7A, false },		{ "cortex-a15", ARM::AK_ARMV7A, ARM::FK_NEON_VFPV4, false },
{ "cortex-a17", ARM::AK_ARMV7A, false },		{ "cortex-a17", ARM::AK_ARMV7A, ARM::FK_NEON_VFPV4, false },
{ "krait", ARM::AK_ARMV7A, false },		{ "krait", ARM::AK_ARMV7A, ARM::FK_NEON_VFPV4, false },
{ "cortex-r4", ARM::AK_ARMV7R, true },		{ "cortex-r4", ARM::AK_ARMV7R, ARM::FK_NONE, true },
{ "cortex-r4f", ARM::AK_ARMV7R, false },		{ "cortex-r4f", ARM::AK_ARMV7R, ARM::FK_VFPV3_D16, false },
{ "cortex-r5", ARM::AK_ARMV7R, false },		{ "cortex-r5", ARM::AK_ARMV7R, ARM::FK_NONE, false },
		javed.absarUnsubmitted Not Done Reply Inline Actions Are you sure about cortex-r5 having no default fpu. Please see test/CodeGen/ARM/build-attributes.ll where specifying no fpu seems to generate : CORTEX-R5: .fpu vfpv3-d16 javed.absar: Are you sure about cortex-r5 having no default fpu. Please see test/CodeGen/ARM/build…
		rengolinUnsubmitted Not Done Reply Inline Actions The ARM website says it's optional. It depends on how common is the version with and the one without VFP. But ultimately, yes, we want to keep the same logic everywhere. One of the main goals of this file. :) rengolin: The ARM website says it's optional. It depends on how common is the version with and the one…
{ "cortex-r7", ARM::AK_ARMV7R, false },		// FIXME missing from ARM.td
{ "sc300", ARM::AK_ARMV7M, false },		{ "cortex-r5f", ARM::AK_ARMV7R, ARM::FK_VFPV3_D16, false },
{ "cortex-m3", ARM::AK_ARMV7M, true },		{ "cortex-r7", ARM::AK_ARMV7R, ARM::FK_VFPV3_D16_FP16, false },
{ "cortex-m4", ARM::AK_ARMV7EM, true },		{ "sc300", ARM::AK_ARMV7M, ARM::FK_NONE, false },
{ "cortex-m7", ARM::AK_ARMV7EM, false },		{ "cortex-m3", ARM::AK_ARMV7M, ARM::FK_NONE, true },
{ "cortex-a53", ARM::AK_ARMV8A, true },		{ "cortex-m4", ARM::AK_ARMV7EM, ARM::FK_NONE, true },
		labrineaAuthorUnsubmitted Not Done Reply Inline Actions Having seen test/CodeGen/ARM/build-attributes.ll I can tell that cortex-m4 has VFPv4_SP_D16. That's the reason cortex-m4f is absent. Shall I commit a new patch with this update and remove the cortex-m4f entry? labrinea: Having seen test/CodeGen/ARM/build-attributes.ll I can tell that cortex-m4 has VFPv4_SP_D16.
		rengolinUnsubmitted Not Done Reply Inline Actions If you're not using, yes, please. We need to get all ideas about the targets consistent. I wonder why is build-attributes outputting FPV4 to m4. It has to be getting this default from somewhere else, and it should be the job of this method to provide that info. Can you investigate why it is so? I believe whatever method getting the default should be moved to use this table instead. You'll probably find many inconsistencies, that will need to be fixed. :) rengolin: If you're not using, yes, please. We need to get all ideas about the targets consistent. I…
		javed.absarUnsubmitted Not Done Reply Inline Actions I think it is outputting the fpu name from TargetParser (the FPUnAMES[ ] table): OS << "\t.fpu\t" << ARMTargetParser::getFPUName(FPU) << "\n" and it is reading the build attributes values from include/llvm/Support/ARMBuildAttributes.h: "AllowFPv4B = 6, // v4 FP ISA was permitted,...." which it indexes using e.g ARM::FK_FPV4_SP_D16 => setAttributeItem(ARMBuildAttrs::FP_arch, ARMBuildAttrs::AllowFPv4B In other words, I think you just need to check to ensure that ARMBuildAttributes.h match with TRM and FPUNames[ ] table. But I could be still missing something.. javed.absar: I think it is outputting the fpu name from TargetParser (the FPUnAMES[ ] table): OS << "\t.
{ "cortex-a57", ARM::AK_ARMV8A, false },		// FIXME missing from ARM.td
{ "cortex-a72", ARM::AK_ARMV8A, false },		{ "cortex-m4f", ARM::AK_ARMV7EM, ARM::FK_FPV4_SP_D16, false },
{ "cyclone", ARM::AK_ARMV8A, false },		{ "cortex-m7", ARM::AK_ARMV7EM, ARM::FK_FPV5_D16, false },
{ "generic", ARM::AK_ARMV8_1A, true },		{ "cortex-a53", ARM::AK_ARMV8A, ARM::FK_CRYPTO_NEON_FP_ARMV8, true },
		{ "cortex-a57", ARM::AK_ARMV8A, ARM::FK_CRYPTO_NEON_FP_ARMV8, false },
		{ "cortex-a72", ARM::AK_ARMV8A, ARM::FK_CRYPTO_NEON_FP_ARMV8, false },
		{ "cyclone", ARM::AK_ARMV8A, ARM::FK_CRYPTO_NEON_FP_ARMV8, false },
		{ "generic", ARM::AK_ARMV8_1A, ARM::FK_NEON_FP_ARMV8, true },
// Non-standard Arch names.		// Non-standard Arch names.
{ "iwmmxt", ARM::AK_IWMMXT, true },		{ "iwmmxt", ARM::AK_IWMMXT, ARM::FK_NONE, true },
{ "xscale", ARM::AK_XSCALE, true },		{ "xscale", ARM::AK_XSCALE, ARM::FK_NONE, true },
{ "arm10tdmi", ARM::AK_ARMV5, true },		{ "arm10tdmi", ARM::AK_ARMV5, ARM::FK_NONE, true },
{ "arm1022e", ARM::AK_ARMV5E, true },		{ "arm1022e", ARM::AK_ARMV5E, ARM::FK_NONE, true },
{ "arm1136j-s", ARM::AK_ARMV6J, true },		{ "arm1136j-s", ARM::AK_ARMV6J, ARM::FK_NONE, true },
{ "arm1136jz-s", ARM::AK_ARMV6J, false },		{ "arm1136jz-s", ARM::AK_ARMV6J, ARM::FK_NONE, false },
{ "cortex-m0", ARM::AK_ARMV6SM, true },		{ "cortex-m0", ARM::AK_ARMV6SM, ARM::FK_NONE, true },
{ "arm1176jzf-s", ARM::AK_ARMV6HL, true },		{ "arm1176jzf-s", ARM::AK_ARMV6HL, ARM::FK_VFPV2, true },
{ "cortex-a8", ARM::AK_ARMV7, true },		{ "cortex-a8", ARM::AK_ARMV7, ARM::FK_NEON, true },
{ "cortex-a8", ARM::AK_ARMV7L, true },		{ "cortex-a8", ARM::AK_ARMV7L, ARM::FK_NEON, true },
{ "cortex-a8", ARM::AK_ARMV7HL, true },		{ "cortex-a8", ARM::AK_ARMV7HL, ARM::FK_NEON, true },
{ "cortex-m4", ARM::AK_ARMV7EM, true },		{ "cortex-m4", ARM::AK_ARMV7EM, ARM::FK_NONE, true },
{ "swift", ARM::AK_ARMV7S, true },		{ "swift", ARM::AK_ARMV7S, ARM::FK_NEON_VFPV4, true },
// Invalid CPU		// Invalid CPU
{ "invalid", ARM::AK_INVALID, true }		{ "invalid", ARM::AK_INVALID, ARM::FK_INVALID, true }
};		};

} // namespace		} // namespace

// ======================================================= //		// ======================================================= //
// Information by ID		// Information by ID
// ======================================================= //		// ======================================================= //

Show All 16 Lines
}		}

unsigned ARMTargetParser::getFPURestriction(unsigned FPUKind) {		unsigned ARMTargetParser::getFPURestriction(unsigned FPUKind) {
if (FPUKind >= ARM::FK_LAST)		if (FPUKind >= ARM::FK_LAST)
return 0;		return 0;
return FPUNames[FPUKind].Restriction;		return FPUNames[FPUKind].Restriction;
}		}

		unsigned ARMTargetParser::getDefaultFPU(StringRef CPU) {
		rengolinUnsubmitted Not Done Reply Inline Actions This most definitely must be implemented as a table lookup. That's the whole point of ARMTargetParser is to move from a C++ implementation to a table-gen-like one, so we can move it to table-gen once we finish. Please, convert this to a table lookup. rengolin: This most definitely must be implemented as a table lookup. That's the whole point of…
		for (const auto C : CPUNames) {
		if (CPU == C.Name)
		return C.FPUID;
		}
		return ARM::FK_INVALID;
		}

bool ARMTargetParser::getFPUFeatures(unsigned FPUKind,		bool ARMTargetParser::getFPUFeatures(unsigned FPUKind,
std::vector<const char *> &Features) {		std::vector<const char *> &Features) {

if (FPUKind >= ARM::FK_LAST \|\| FPUKind == ARM::FK_INVALID)		if (FPUKind >= ARM::FK_LAST \|\| FPUKind == ARM::FK_INVALID)
return false;		return false;

// fp-only-sp and d16 subtarget features are independent of each other, so we		// fp-only-sp and d16 subtarget features are independent of each other, so we
// must enable/disable both.		// must enable/disable both.
Show All 13 Lines	bool ARMTargetParser::getFPUFeatures(unsigned FPUKind,
}		}

// FPU version subtarget features are inclusive of lower-numbered ones, so		// FPU version subtarget features are inclusive of lower-numbered ones, so
// enable the one corresponding to this version and disable all that are		// enable the one corresponding to this version and disable all that are
// higher. We also have to make sure to disable fp16 when vfp4 is disabled,		// higher. We also have to make sure to disable fp16 when vfp4 is disabled,
// as +vfp4 implies +fp16 but -vfp4 does not imply -fp16.		// as +vfp4 implies +fp16 but -vfp4 does not imply -fp16.
switch (FPUNames[FPUKind].FPUVersion) {		switch (FPUNames[FPUKind].FPUVersion) {
case ARM::FV_VFPV5:		case ARM::FV_VFPV5:
Features.push_back("+fp-armv8");		Features.push_back("+fp-armv8");
		rengolinUnsubmitted Not Done Reply Inline Actions If you had this in a table you wouldn't need a comment. :) rengolin: If you had this in a table you wouldn't need a comment. :)
		javed.absarUnsubmitted Not Done Reply Inline Actions Why does VFPV4 etc appear below NEON column? Instead of VFPV3_D16/Xd I think the proper nomenclature is VFPV3XD (please see FPUNames table in same file). A qs (probably for Renato), is there a way to ensure consistency between what is declared here (which fpu) and what is stated in ARM.td ? javed.absar: Why does VFPV4 etc appear below NEON column? Instead of VFPV3_D16/Xd I think the proper…
		rengolinUnsubmitted Not Done Reply Inline Actions Hi Javed, There is currently no way to ensure consistency between this file and any table generated file. The ultimate goal of the target parsers is to have all tables generated from the .td files, but for that to happen, we need to create a new TableGen back-end that is fast and produce those tables for all* targets even if no back-end is built in, due to Clang needing them regardless. Once the tables are stable, we can start moving them to tablegen and add a general rule to CMake to always build them. Alexandros, Feel free to add the missing CPUs with the respective FPU/NEON supports to the main table. cheers, --renato rengolin: Hi Javed, There is currently no way to ensure consistency between this file and any table…
break;		break;
case ARM::FV_VFPV4:		case ARM::FV_VFPV4:
Features.push_back("+vfp4");		Features.push_back("+vfp4");
Features.push_back("-fp-armv8");		Features.push_back("-fp-armv8");
break;		break;
case ARM::FV_VFPV3_FP16:		case ARM::FV_VFPV3_FP16:
Features.push_back("+vfp3");		Features.push_back("+vfp3");
Features.push_back("+fp16");		Features.push_back("+fp16");
▲ Show 20 Lines • Show All 306 Lines • Show Last 20 Lines

This is an archive of the discontinued LLVM Phabricator instance.

API for retrieving default FPU target features from TargetParserClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 29668

include/llvm/Support/TargetParser.h

lib/Support/TargetParser.cpp

API for retrieving default FPU target features from TargetParser
ClosedPublic