This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SME] Add ldr/str (fill/spill) intrinsics
ClosedPublic

Authored by david-arm on Jun 8 2022, 8:59 AM.

Details

Summary

This patch adds implementations for the fill/spill SME ACLE intrinsics:

@llvm.aarch64.sme.ldr
@llvm.aarch64.sme.str

Diff Detail

Event Timeline

david-arm created this revision.Jun 8 2022, 8:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 8 2022, 8:59 AM
david-arm requested review of this revision.Jun 8 2022, 8:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 8 2022, 9:00 AM
aemerson added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
2354

Nit: Comments usually start with capitals.

llvm/lib/Target/AArch64/AArch64InstrFormats.td
3210 ↗(On Diff #435164)

s/offset/Offset

llvm/test/CodeGen/AArch64/SME/sme-intrinsics-loads.ll
265

I'm not following why a byte offset from a pointer would lower into an offset scaled by VL?

david-arm added inline comments.Jun 9 2022, 1:08 AM
llvm/test/CodeGen/AArch64/SME/sme-intrinsics-loads.ll
265

That's an excellent point @aemerson! Something has clearly gone wrong here - I'll investigate!

david-arm updated this revision to Diff 435487.Jun 9 2022, 3:43 AM
  • Fixed bug in isel pattern for imm offset forms of ldr/str
david-arm marked 2 inline comments as done.Jun 9 2022, 3:43 AM
aemerson accepted this revision.Jun 9 2022, 10:45 AM

LGTM.

This revision is now accepted and ready to land.Jun 9 2022, 10:45 AM
c-rhodes added inline comments.Jun 10 2022, 1:35 AM
llvm/lib/Target/AArch64/SMEInstrFormats.td
25

I think s4 implies 4-bit signed?

This revision was landed with ongoing or failed builds.Jun 14 2022, 5:58 AM
This revision was automatically updated to reflect the committed changes.
bryanpkc added inline comments.
llvm/test/CodeGen/AArch64/SME/sme-intrinsics-stores.ll
289

@david-arm How does the VL multiplier in the address operand affect the vector offset? Could you explain why za[w12, 0] through za[w12, 15] are selected for VL multipliers 0-15, but for all multipliers larger than 16, only za[w12, 0] is selected (w12 remains zero in all cases)? I understand that ACLE intrinsics won't actually generate slice offsets larger than 15, but the behaviour of the LLVM intrinsics is confusing (as they do not accept an explicit slice offset argument).

sdesmalen added inline comments.Mar 29 2023, 4:04 AM
llvm/test/CodeGen/AArch64/SME/sme-intrinsics-stores.ll
289

I'm not sure if this answers your question, but it seemed we had a fix for the stores that we hadn't pushed upstream yet. It avoids creating e.g. str za[w12, 0], [x0, #15, mul vl], since the immediates must match up. The fix can be found here: D147136.

bryanpkc added inline comments.Mar 29 2023, 9:47 AM
llvm/test/CodeGen/AArch64/SME/sme-intrinsics-stores.ll
289

Thanks! I noticed that discrepancy between LDR and STR later yesterday too. But how the LLVM intrinsics are designed to work is still not clear to me. Determining the slice offset based on the addend in the address operand, and resetting the slice offset to 0 if the addend is larger than 256 bytes, seems rather arbitrary. Am I missing something?

rsandifo-arm added inline comments.
llvm/test/CodeGen/AArch64/SME/sme-intrinsics-loads.ll
280

I think this is the same as @bryanpkc's comment, but I'm not sure how this can legitimately become:

mov w12, wzr
ldr za[w12, 15], [x0, #15, mul vl]

If the intrinsic uses a single argument for the ZA index, shouldn't the register part of the index be adjusted to match the immediate offset? I.e. shouldn't w12 + 15 == 0, to match the 0 argument to the intrinsic?

david-arm added inline comments.Apr 3 2023, 1:35 AM
llvm/test/CodeGen/AArch64/SME/sme-intrinsics-loads.ll
280

Hmm, yes I see your point. The instruction encoding is a little odd in that there is a single offset immediate used for both the vector select and for the address offset. I thought we had a fix for one of the issues with ldr/str in D147136, but I think there might still be a problem with the complex pattern. @rsandifo-arm @bryanpkc I'll take a look!