This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SVE2] Add the SVE2.1 quadword variants of ld1w/ld1d/st1w/st1d
ClosedPublic

Authored by david-arm on Nov 2 2022, 5:13 AM.

Details

Summary

This patch adds the assembly/disassembly for the following instructions:

st1w: Contiguous store words from vector (128-bit vector elements)
st1d: Contiguous store doublewords from vector (128-bit vector elements)
ld1w: Contiguous load unsigned words to vector (128-bit vector elements)
ld1d: Contiguous load unsigned doublewords to vector (128-bit vector elements)

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09

Diff Detail

Event Timeline

david-arm created this revision.Nov 2 2022, 5:13 AM
Herald added a project: Restricted Project. · View Herald Transcript
david-arm requested review of this revision.Nov 2 2022, 5:13 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 2 2022, 5:13 AM
c-rhodes added inline comments.
llvm/lib/Target/AArch64/AArch64SchedNeoverseN2.td
21 ↗(On Diff #472582)

most of SVE is supported by N2, this should be [HasSVE2p1]

paulwalker-arm added inline comments.Nov 3 2022, 6:27 AM
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
1299

As with previous comments I'm guessing you only using this so you don't get the alias, but I think we're ok having the alias.

llvm/lib/Target/AArch64/SVEInstrFormats.td
5782–5788

Do we need this class? It looks like the only difference is the support aliases, but I don't see a reason the quad case need to be more restrictive.

9227–9229

Given they're structurally identical what about extending sve_mem_cst_si to include an extra q bit. This will also mean the new instructions have a matching set of InstAliases. This also matches the path you've been able to take regarding the stores.

9256

Perhaps just extend sve_mem_cld_ss_base? I'd be tempted to extend sve_mem_cld_ss also, but that's up to you.

Matt added a subscriber: Matt.Nov 5 2022, 8:33 PM
david-arm added inline comments.Nov 6 2022, 11:02 PM
llvm/lib/Target/AArch64/SVEInstrFormats.td
9227–9229

I assume you mean sve_mem_cld_si_base, not sve_mem_cst_si since they are stores? If you're referring to sve_mem_cld_si_base then they're not quite structurally the same since the dtype field is only bits 24-23 for quadwords. By bringing them together it does become a bit confusing because treating the quadword encoding group as having a 24-21 bit dtype field leads to exactly the same dtypes we have for LD1SB (1100) and LD1SH (1000).

Reusing the classes reduces code for sure, but it might make it more confusing. I'll give it a try anyway and see how it looks.

david-arm updated this revision to Diff 473582.Nov 7 2022, 12:14 AM
david-arm marked 3 inline comments as done.
  • Attempted to reuse as many existing classes as possible.
  • Added aliases for all new loads and stores that don't require the braces around the vector register, e.g. ld1w z3.q, ...
  • Added tests for all additional aliases.
david-arm marked an inline comment as done.Nov 7 2022, 12:15 AM
david-arm added inline comments.
llvm/lib/Target/AArch64/SVEInstrFormats.td
9227–9229

Hmm, the problem with reusing the classes is that it requires inverting the meaning of the nf bit. In sve_mem_cld_si_base the nf is bit 20, which is set to 0b0 for normal non-quadword loads, i.e. LD1W_IMM, however for sve_mem_128b_cld_si bit 20 is 0b1! It seems a bit odd to say that a faulting quadword load has nf=1. If you don't mind I'll keep the existing class as it is?

9256

Again, I think this doesn't really help because the quadword loads (sve_mem_128b_cld_ss) and other loads (sve_mem_cld_ss_base) have different bits 15-14:

sve_mem_128b_cld_ss: 0b10
sve_mem_cld_ss_base: 0b01

It seems to me like the two encoding groups are not close enough to be reused easily.

paulwalker-arm accepted this revision.Nov 7 2022, 5:10 AM
paulwalker-arm added inline comments.
llvm/lib/Target/AArch64/SVEInstrFormats.td
9227–9229

I think you're putting a bit too much stock into what these things are called but sure, I can see how there's a bit more going on here than normal so if this is your preference then yes this works for me. Thanks for investigating.

This revision is now accepted and ready to land.Nov 7 2022, 5:10 AM