This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][GlobalISel] CodeGen for Armv8.8/9.3 MOPS
ClosedPublic

Authored by tyb0807 on Jan 20 2022, 2:22 AM.

Details

Summary

This implements codegen for Armv8.8/9.3 Memory Operations extension (MOPS).
Any memcpy/memset/memmov intrinsics will always be emitted as a series
of three consecutive instructions P, M and E which perform the
operation. The SelectionDAG implementation is split into a separate
patch.

AArch64LegalizerInfo will now consider the following generic opcodes
if +mops is available, instead of legalising by expanding them to
libcalls: G_BZERO, G_MEMCPY_INLINE, G_MEMCPY, G_MEMMOVE, G_MEMSET
The s8 value of memset is legalised to s64 to match the pseudos.

AArch64O0PreLegalizerCombinerInfo will still be able to combine
G_MEMCPY_INLINE even if +mops is present, as it is unclear whether it is
better to generate fixed length copies or MOPS instructions for the
inline code of small or zero-sized memory operations, so we choose to be
conservative for now.

AArch64InstructionSelector will select the above as new pseudo
instructions: AArch64::MOPSMemory{Copy/Move/Set/SetTagging} These are
each expanded to a series of three instructions (e.g. SETP/SETM/SETE)
which must be emitted together during code emission to avoid scheduler
reordering.

This is part 3/4 of a series of patches split from
https://reviews.llvm.org/D117405 to facilitate reviewing.

Patch by Tomas Matheson and Son Tuan Vu

Diff Detail

Event Timeline

tyb0807 created this revision.Jan 20 2022, 2:22 AM
tyb0807 requested review of this revision.Jan 20 2022, 2:22 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 20 2022, 2:22 AM
dmgreen added inline comments.
llvm/lib/Target/AArch64/GISel/AArch64O0PreLegalizerCombiner.cpp
90

Should these still be returning false, or are fixed length copies better left for the inline code? The quality of the code matter less at O0, but we are likely better off being conservative for the time being, like we are for ISel.

llvm/unittests/Target/AArch64/InstSizes.cpp
167

Nice test. Can you format it to limit the length of the line?

tyb0807 marked 2 inline comments as done.Jan 20 2022, 3:35 PM
tyb0807 updated this revision to Diff 401853.Jan 20 2022, 9:15 PM

Be conservative on whether it is better to use MOPS instructions for the inline
code of small or zero-sized memory operations, i.e. we only emit MOPS
instructions for any time we used to emit an inline call to memcopy.

tyb0807 updated this revision to Diff 401856.Jan 20 2022, 9:16 PM
tyb0807 edited the summary of this revision. (Show Details)

Update commit message to reflect latest changes

tyb0807 updated this revision to Diff 401891.Jan 21 2022, 1:06 AM
tyb0807 edited the summary of this revision. (Show Details)

Update commit message

tyb0807 updated this revision to Diff 402887.Jan 25 2022, 6:23 AM

clang-format tests

Matt added a subscriber: Matt.Jan 25 2022, 3:11 PM
dmgreen added inline comments.Jan 26 2022, 6:47 AM
llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
3429

Why does this have a case for G_BZERO if selectMOPS asserts that opcode != G_BZERO?

llvm/lib/Target/AArch64/GISel/AArch64O0PreLegalizerCombiner.cpp
94

This one too? It may want to prevent turning it into a bzero in this case though.

llvm/test/CodeGen/AArch64/aarch64-mops-mte.ll
3

Can you add a run line without -O0, as they run through different code. Using --check-prefixes can help if they have identical output.

llvm/unittests/Target/AArch64/InstSizes.cpp
25

All of this seems unrelated?

tyb0807 updated this revision to Diff 403483.Jan 26 2022, 8:16 PM
tyb0807 marked 4 inline comments as done.

Be conservative on when to inline mem* functions. Remove dead code. Add more tests

llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
3429

Good point. Thanks for catching this. I'm also removing the assert, as there's no other way expandMOPS is gonna be called.

llvm/unittests/Target/AArch64/InstSizes.cpp
25

This is the result of clang-format -i, I guess this is more correct than the older format

dmgreen added inline comments.Jan 27 2022, 1:21 AM
llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
945

I think this assert is still valid, isn't it?

llvm/test/CodeGen/AArch64/aarch64-mops-mte.ll
14

These instructions are invalid without MTE. The test should keep +mte

llvm/test/CodeGen/AArch64/aarch64-mops.ll
4–5

perhaps use --check-prefixes=GISel-WITHOUT-MOPS;GISel-WITHOUT-MOPS-O0 and --check-prefixes=GISel-WITHOUT-MOPS;GISel-WITHOUT-MOPS-O3. Same for the GISel-MOPS below. The tests with identical output can share the check lines then, which keeps things a little cleaner.

tyb0807 marked 2 inline comments as done.Jan 27 2022, 2:05 AM
tyb0807 added inline comments.
llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
945
llvm/test/CodeGen/AArch64/aarch64-mops-mte.ll
14

See reply above for removing hasMTE assertion

dmgreen added inline comments.Jan 27 2022, 5:03 AM
llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
945

This can't just generate instructions for an architecture that is not enabled. That is an error (even if it doesn't currently give an error - there should be a verifier check for it, but it only works in certain situations at the moment). It doesn't matter a lot if the assert is there or not, but it should be invalid to get here without the architecture extension (from clang).

The spec sounds wrong to me - it doesn't make much sense to define an intrinsic that can't be used under the current architecture.

tyb0807 updated this revision to Diff 404158.EditedJan 28 2022, 1:44 PM
tyb0807 marked 4 inline comments as done.

Refactor tests and update accordingly to change from ACLE specification: __builtin_arm_mops_memset_tag requires _both_ MOPS and MTE features

llvm/unittests/Target/AArch64/InstSizes.cpp
25

Reverted these formatting, according to https://llvm.org/docs/DeveloperPolicy.html#obtaining-commit-access (point 2). Thanks to @lenary for pointing this out

tyb0807 updated this revision to Diff 404301.Jan 29 2022, 12:54 PM

Remove unused declaration

dmgreen accepted this revision.Jan 30 2022, 12:59 PM

Thanks. I'm not the biggest Global-ISel expert, but from what I can tell this looks good to me.

This revision is now accepted and ready to land.Jan 30 2022, 12:59 PM
This revision was landed with ongoing or failed builds.Jan 31 2022, 12:54 PM
This revision was automatically updated to reflect the committed changes.