This is an archive of the discontinued LLVM Phabricator instance.

[Clang][AArch64] Add support for -mcpu=grace
ClosedPublic

Authored by SjoerdMeijer on Oct 21 2022, 12:44 AM.

Details

Summary

This adds Clang command line support for the NVIDIA Grace CPU [1], which we would like to target with -mcpu=grace.

Grace is based on the Arm Neoverse V2 CPU, see also [1], which is why in the driver we pass neoverse-v2 to the compiler when grace is requested. Thus, for now, it's an alias to the Neoverse V2.

We will probably follow with a few more patches in this area (e.g. default to grace given a certain triple), but this is probably a good first step

[1] https://www.nvidia.com/en-us/data-center/grace-cpu/

Diff Detail

Event Timeline

SjoerdMeijer created this revision.Oct 21 2022, 12:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 21 2022, 12:44 AM
SjoerdMeijer requested review of this revision.Oct 21 2022, 12:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 21 2022, 12:44 AM
Herald added a subscriber: MaskRay. · View Herald Transcript
sushgokh added inline comments.Oct 21 2022, 1:17 AM
llvm/include/llvm/Support/AArch64TargetParser.def
247 ↗(On Diff #469489)

Can you check if the test case compiles without this because we are already returning "neoverse-v2" at frontend in clang/lib/Driver/ToolChains/Arch/AArch64.cpp ?

SjoerdMeijer added inline comments.Oct 21 2022, 1:35 AM
llvm/include/llvm/Support/AArch64TargetParser.def
247 ↗(On Diff #469489)

This is a very good point. And I tried this first, but the result was that we then first get a "unrecognised cpu" warning. Thus, I found it easier to just add the definition here.

The other thing about this is that we don't necessarily need the "full definition" with all the architecture extensions, but thought that when I add I might as well be complete.

sushgokh added inline comments.Oct 21 2022, 4:55 AM
llvm/include/llvm/Support/AArch64TargetParser.def
247 ↗(On Diff #469489)

Check if this helps:

@@ -124,6 +128,9 @@ static bool DecodeAArch64Mcpu(const Driver &D, StringRef Mcpu, StringRef &CPU,

if (CPU == "native")
  CPU = llvm::sys::getHostCPUName();

+ if(CPU == "grace")
+ CPU="neoverse-v2";
+

if (CPU == "generic") {
  Features.push_back("+neon");
dmgreen added a subscriber: ktkachov.

I am not against this, but as @ktkachov said it sets a new precedence. We have not in the past added CPU names which are aliases.

If we are doing this, can we build a better way of adding aliases? Maybe a list in TargetParser.def that gets queried at the relevant places?

I am not against this, but as @ktkachov said it sets a new precedence. We have not in the past added CPU names which are aliases.

If we are doing this, can we build a better way of adding aliases? Maybe a list in TargetParser.def that gets queried at the relevant places?

There are a few unknowns here I think (at least for me):

  • Are we going to have (a lot) more aliases? I don't know, but I would guess not.
  • Do we later want to distinguish V2 from Grace (for one reason or another)? I don't know yet, but maybe.

I am happy to investigate how we could introduce aliases, but then again, not sure how useful that would be given what I mentioned above; that's the reason I went for this approach, as a one-off, for now.

But let me know what you think, and am happy to look.

I am not against this, but as @ktkachov said it sets a new precedence. We have not in the past added CPU names which are aliases.

If we are doing this, can we build a better way of adding aliases? Maybe a list in TargetParser.def that gets queried at the relevant places?

There are a few unknowns here I think (at least for me):

  • Are we going to have (a lot) more aliases? I don't know, but I would guess not.
  • Do we later want to distinguish V2 from Grace (for one reason or another)? I don't know yet, but maybe.

I am happy to investigate how we could introduce aliases, but then again, not sure how useful that would be given what I mentioned above; that's the reason I went for this approach, as a one-off, for now.

But let me know what you think, and am happy to look.

I think having an alias capability would be useful, particularly as we sometimes want to enable codenames before an official CPU launch (like happened with "zeus" and "demeter" in GCC) and want to retain the codenames for backwards compatibility with released toolchains. Having "grace" as an explicit alias of "neoverse-v2" would also ensure that any improvements folks make to its tuning benefit "neoverse-v2" as well. If they need to deviate in the future for some good reason then we can be explicit about it and split them into separate CPU options.

I don't know the background on the wider need for aliases. Makes sense to me in this specific case.

I would also rather this information go into the target parser. Putting it in the .def file is going to be way too much work for this one thing, so just a function in AArch64TargetParser.h like resolveCPUAlias? It can be just if grace return v2 else return the original name, it will be one less piece of target info to track down if/when we modernise the target parsers.

(if we have more aliases in future then yes we could extend the .def records and just implement resolveCPUAlias in terms of those)

Following Shushant's suggestion, the number of changes have been reduced to basically a 2-liner:

  • Grace no longer exits as CPU definition in TargetParser,
  • the downside is that it no longer appears as a suggestion of a valid CPU when an invalid one is given (clang/test/Misc/target-invalid-cpu-note.c)

I think having an alias capability would be useful, particularly as we sometimes want to enable codenames before an official CPU launch (like happened with "zeus" and "demeter" in GCC) and want to retain the codenames for backwards compatibility with released toolchains. Having "grace" as an explicit alias of "neoverse-v2" would also ensure that any improvements folks make to its tuning benefit "neoverse-v2" as well. If they need to deviate in the future for some good reason then we can be explicit about it and split them into separate CPU options.

Ok, I see there's a use-case for this. Not sure how often code-names comes up for us though, but hey, I could use an alias capability here. So how about this as a strategy. This patch has been reduced to a 2-liner, so impact on the codebase is minimal. I have placed a FIXME that CPU aliases should be supported in TargetParser. Can we support grace like this, while I work on your feature request (free of charge ;-))? A new feature requires a bit of surgery here and there, and it would be nice if grace support doesn't depend on that. Once we have the infrastructure, I will port grace over to that as a first use-case.

What do you think?

I'm in agreement with @DavidSpickett - any resolution of aliases has to be done using functionality in AArch64TargetParser, rather than adding special cases spread around the rest of the LLVM and Clang codebase.

I didn't see David's reply until I posted mine.
Sure, I am happy with that, I will add a little helper function!

@SjoerdMeijer can you make sure that llc -mcpu=grace also works, in the next version of the patch? It doesn't need a patch, but it will prove that you have aliases right.

I am hoping it is fairly simple to just a table that is queried in places like getAArch64TargetCPU and fillValidCPUArchList. That way you could get the cpu list working too.

It might be useful to add a test that -mtune=grace works as expected too.

clang/lib/Driver/ToolChains/Arch/AArch64.cpp
132

Formatting.

You cannot put grace into this file:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AArch64/AArch64.td
With the same settings as a neoverse-v2?

You cannot put grace into this file:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AArch64/AArch64.td
With the same settings as a neoverse-v2?

This is duplication we want to avoid, like wanting to avoid the duplication in AArch64TargetParser.def that was in a previous version of this patch.

This now uses a new helper resolveCPUAlias.

can you make sure that llc -mcpu=grace also works, in the next version of the patch?

Hm, this is a good point. And, of course, it doesn't work yet, that would have been too easy.
The problem of course is that options parsing is all over the place, the MC layers has its own bits and pieces, which doesn't handle this yet so would need to resolve aliases in some way.

So what this patch achieves is that the user-facing tool Clang understands mcpu=grace, but the developer tools (opt, llc) not yet. That is certainly not ideal, but probably something I can live with as a first step.

You cannot put grace into this file:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AArch64/AArch64.td
With the same settings as a neoverse-v2?

In my first revision I had a grace definition in AArch64TargetParser.def. Fair enough, that was the Clang side, not LLVM, but point is that if a CPU is pure alias we want to map onto an existing definition, which we don't achieve in that way.

I think I am half happy for now; this patch addresses the Clang part, the not yet existing LLVM part counter part needs looking into.

DavidSpickett accepted this revision.Oct 24 2022, 1:06 AM

This LGTM given you just want support in clang. New helper is tested by the driver test, if we get more aliases later we can add isolated tests for it.

(anyone else feel free to object)

This revision is now accepted and ready to land.Oct 24 2022, 1:06 AM

Thanks for your help and reviews!

There's more to do this in this area, will probably follow up on this, but am happy with this as a first step.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptOct 24 2022, 5:26 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript