This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Add support for the 'R' architecture profile.
ClosedPublic

Authored by labrinea on Sep 20 2021, 4:44 AM.

Details

Summary

This change introduces subtarget features to predicate certain instructions and system registers that are available only on 'A' profile targets. Those features are not present when targeting a generic CPU, which is the default processor.

In other words the generic CPU now means the intersection of 'A' and 'R' profiles. To maintain backwards compatibility we enable the features that correspond to -march=armv8-a when the architecture is not explicitly specified on the command line.

References: https://developer.arm.com/documentation/ddi0600/latest

Diff Detail

Event Timeline

labrinea created this revision.Sep 20 2021, 4:44 AM
labrinea requested review of this revision.Sep 20 2021, 4:44 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptSep 20 2021, 4:44 AM
pratlucas added inline comments.
llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
3301

As features can now depend on HasV8_0aOps, does it make sense to cover it here?

llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp
1594–1598

Nit: Maybe extract this block to a function as it's used a few times.

labrinea updated this revision to Diff 374572.Sep 23 2021, 8:46 AM
labrinea marked 2 inline comments as done.

I wanted to clarify the chosen strategy as the desciption was perhaps not very informative. The are some instructions and system registers that are present in v8-a but not in v8-r, and so I am inclined to turn the generic cpu into the intersection of the two architecture profiles. The reasoning is to stay inline with the existing driver strategy: even when -march is specified on the command line clang passes -taget-cpu=generic to the backend, plus whatever subtarget features are implied by -march. The only exception to this rule seems to be AArch32 v8-r, where cortex-r52 is prefered over generic. This is an inconsistency and I wouldn't want to repeat the same for AArch64 (unless it's really necessary). That said my current approach will be breaking the current tools behavior: when the user only specifies the triple and not -march then they will be targeting the intersection, not v8-a. The impact is not expected to be significant as the missing instructions (smc, dcps3) were not being generated anyway. However if the two profiles significantly diverge in the future it might become a problem. There are some alternative solutions to this:

  • make the clang driver implicitly pass -march=armv8-a when only the triple is specified; then all the v8-a subtarget features will be enabled allowing smc, dcps3 and the system registers to be available (my preference)
  • leave the generic cpu as is (so that it means v8-a, not the intersection), and make the clang driver set -target-cpu with the default cpu for a given architecture (according to the Target Parser) when -march is specified; this change will also be breaking the current tools behavior though (middle ground option)
  • introduce a new generic-r cpu (or use cortex-r82) when setting -target-cpu, but only when -march=armv8-r, otherwise choose generic (least favorite option)

I am adding a few people for visibility, requesting comments @t.p.northover @srhines @nickdesaulniers

That said my current approach will be breaking the current tools behavior: when the user only specifies the triple and not -march then they will be targeting the intersection, not v8-a.

For the purposes of compiling the linux kernel, for the ARCH=arm64 port, we always pass -Wa,-march=armv8.5-a but not -march=. While it's trivial to add it to the kernel Makefiles, I worry if this causes clang's behavior to differ significantly from GCC. Does GCC support the R profile?

The impact is not expected to be significant as the missing instructions (smc, dcps3) were not being generated anyway.

I can see in the kernel sources that hypervisor code in KVM is using smc instructions. kvm/hyp/hyp-entry.S Since we're passing -Wa,-march=, I suspect with your patch we'd be fine.

tmatheson added inline comments.
clang/lib/Basic/Targets/AArch64.h
62

The equivalent in the ARM backend is named getCPUProfile

tmatheson added inline comments.Oct 1 2021, 2:52 AM
clang/lib/Basic/Targets/AArch64.h
62

That's arguably a worse name though.

labrinea updated this revision to Diff 376942.Oct 4 2021, 10:32 AM
labrinea edited the summary of this revision. (Show Details)

Change from last revision: The driver implicitly enables the 'A' profile features (as if -march=armv8-a was specified on the command line) when only the target triple is specified in order to maintain backwards compatibility.

Matt added a subscriber: Matt.Oct 5 2021, 12:42 PM

Looking at the behaviour of gcc it looks like there -march=armv8-r enables instructions that are present in 8-r but not 8-a, but doesn't disable any instructions. So e.g. it will accept the dcps3 instruction when compiling with -march=armv8-r whereas clang won't. That's probably a bug in gcc though.

john.brawn added inline comments.Oct 12 2021, 8:49 AM
llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp
1554

It would be better if we had a generic way to handle registers with overlapping encodings, instead of handling the two registers explicitly here. I'm not sure of the best way to do that, but looking at AArch64SystemOperands.td it looks like maybe a way to do it would to add an extra "AltName" field to give an alternate name for the same encoding, so e.g. TTBR0_EL2 would have AltName
VSCTLR_EL2 and vice-versa. So you'd first lookup by encoding, then if that didn't work you'd lookup by name with AltName and check if that one is valid.

labrinea updated this revision to Diff 380171.Oct 16 2021, 4:32 AM

Added an alternative name to indicate sytem register aliasing.

labrinea marked an inline comment as done.Oct 16 2021, 4:34 AM
labrinea added inline comments.
llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp
1554

Done, but it may turn problematic if we start having more than one alternative names, i.e. if multiple architecture extensions reference the same encoding using a different name.

john.brawn accepted this revision.Oct 25 2021, 5:56 AM

I have one small comment, but otherwise LGTM.

llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp
1558

It would be good to have a comment here explaining what's going on.

This revision is now accepted and ready to land.Oct 25 2021, 5:56 AM
labrinea updated this revision to Diff 382288.Oct 26 2021, 6:38 AM
labrinea marked an inline comment as done.

Added a comment explaining system register lookups by alternative name as suggested and rebased on top of https://reviews.llvm.org/D111551. @john.brawn you may want to re-review the patch as the rebase had conflicts in llvm/lib/Target/AArch64/AArch64.td.

labrinea updated this revision to Diff 382294.Oct 26 2021, 7:08 AM

Added v8.1a_ops on AppleA10. However, the target parser lists it as v8.0a, which seems odd. Out of the scope of this patch anyway.

Added v8.1a_ops on AppleA10. However, the target parser lists it as v8.0a, which seems odd. Out of the scope of this patch anyway.

Looking at the definition of A10 in target parser it doesn't have LSE, and that's required in 8.1-a. So it looks like it should have HasV8_0aOps.

labrinea updated this revision to Diff 382584.Oct 27 2021, 3:00 AM

Changed AppleA10 to HasV8_0aOps

This revision was automatically updated to reflect the committed changes.