This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Adding "armv8.8-a" memcpy/memset support.
ClosedPublic

Authored by tmatheson on Dec 22 2021, 3:22 AM.

Details

Summary

This change introduces the Arm 8.8-A instructions for FEAT_MOPS. Documentation for these instructions can be found in the A64 ISA documentation.

There are a lot of variants so I won't link to all of the documentation individually here. There are four basic operations with the following prefixes:

  • CPYP*: Memory copy
  • CPYF*: Memory copy, forward-only
  • SET*: Memory Set
  • SETG*: Memory Set with tag setting, which requires FEAT_MTE in addition to FEAT_MOPS

For each of these operations there are three instructions differentiated by P/M/E for preconditioning/main/end. For example, SETGP, SETGM, SETGE.

For each set of three instructions there are also variants for privileged/unprivileged and temporal/non-temporal e.g. CPYPTN, CPYMTN, CPYETN

For the copy instructions, you can separately specify the read and write translations (so that kernels can safely use these instructions in syscall handlers, to memcpy between the calling process's user-space memory map and the kernel's own privileged one).

Some of these instructions write back to multiple registers to indicate progress in the event of interrupts. Contrary to the previous description of these instructions, they are not required to be placed in a loop.

Patch by Simon Tatham, Victor Campos, Tomas Matheson and Sam Elliott

Diff Detail

Event Timeline

tmatheson created this revision.Dec 22 2021, 3:22 AM
tmatheson requested review of this revision.Dec 22 2021, 3:22 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 22 2021, 3:22 AM

The latest publicly accessible Arm ARM doesn't yet mention the v8.8 extensions, IIUC.
https://developer.arm.com/documentation/ddi0487/gb/
Is there perhaps a draft of the next revision I can refer to to review this patch?

tmatheson edited the summary of this revision. (Show Details)Dec 30 2021, 12:03 AM

I've updated the description with documentation.

dmgreen added inline comments.Jan 3 2022, 2:38 AM
llvm/lib/Target/AArch64/AArch64InstrFormats.td
11455

Should this be hasNoSchedulingInfo? Can we remove this or include something very basic?

Should something here be marking them MayLoad and MayStore? They may already be hasSideEffects, as nothing marks them otherwise and they do not have codegen patterns, which is more strict. But it's probably good to be explicit.

tmatheson updated this revision to Diff 397066.Jan 3 2022, 7:58 AM

Add mayLoad/mayStore/hasSideEffects

tmatheson added inline comments.Jan 3 2022, 8:01 AM
llvm/lib/Target/AArch64/AArch64InstrFormats.td
11455

Should this be hasNoSchedulingInfo? Can we remove this or include something very basic?

I'm not sure why this is set, @simon_tatham might remember. My guess would be that the alternative is adding Sched<[]> but that doesn't seem to work on the multiclass - CodeGenSchedModels::checkCompleteness() does not recognise the final defs as having the Sched subclass.

Matt added a subscriber: Matt.Jan 4 2022, 9:39 AM
simon_tatham added inline comments.Jan 5 2022, 2:18 AM
llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
4975

Surely this ought to say "invalid CPY instruction", and similarly the two below it?

simon_tatham added inline comments.Jan 5 2022, 2:21 AM
llvm/lib/Target/AArch64/AArch64InstrFormats.td
11455

No, sorry – I think when I wrote the internal version of this patch I just did whatever would most quickly get Tablegen to stop complaining about incomplete scheduling models. If @dmgreen thinks it's wrong, I'd trust him over me :-)

tmatheson marked an inline comment as done.Jan 5 2022, 2:29 AM
tmatheson updated this revision to Diff 397530.Jan 5 2022, 4:41 AM

Use Sched<[]> rather than hasNoSchedulingInfo. This requried updating the regexes used by various scheduling models to apply InstRW to CPY* instructions so that they do not capture the new MOPS instructions.

tmatheson marked 2 inline comments as done.Jan 5 2022, 4:43 AM
dmgreen accepted this revision.Jan 5 2022, 5:15 AM

Use Sched<[]> rather than hasNoSchedulingInfo. This requried updating the regexes used by various scheduling models to apply InstRW to CPY* instructions so that they do not capture the new MOPS instructions.

Yeah, I got confused as to why this worked differently to the other new instructions like ST64B. Enough to go and have a look at what was going on, and came to the same conclusion.

I think it's worth renaming the old "CPY" instruction just for clarity. It is not anything to do with any CPY instructions after all. I put up D116655 for that.

Otherwise that and the other point here, this patch LGTM.

llvm/lib/Target/AArch64/AArch64InstrFormats.td
11456

Sorry, I should have been more clear. I'm not sure this actually needs hasSideEffects. I was just trying to explain that it would already have it set (implicitly), so missing mayLoad/mayStore would not technically be a problem, as it already had sideeffects. With mayLoad and mayStore, I don't believe there are any other side effects of these instructions. They just update registers and load/store memory.

This revision is now accepted and ready to land.Jan 5 2022, 5:15 AM

Oh. I just thought that these instructions will unlikely be scheduled if we are generating then from pseudos. Perhaps the scheduling info doesn't matter either way.

Should they be setting or using NZCV though?

Should they be setting or using NZCV though?

Not as far as I am aware.

I'll remove the hasSideEffects before landing.

Thanks everyone for the comments.

This revision was landed with ongoing or failed builds.Jan 5 2022, 6:45 AM
This revision was automatically updated to reflect the committed changes.