This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SME] Enable TPIDR2 lazy-save for za_preserved
ClosedPublic

Authored by MattDevereau on Aug 30 2023, 4:24 AM.

Details

Summary

This change makes callees with the arm_preserves_za
type attribute comply with the dormant state requirements
when it's caller has the
arm_shared_za type attribute

https://github.com/ARM-software/abi-aa/blob/5e67092434b50c04f8ad178a9c272ce3c6ada7fd/aapcs64/aapcs64.rst?plain=1#L1381

Diff Detail

Event Timeline

MattDevereau created this revision.Aug 30 2023, 4:24 AM
MattDevereau requested review of this revision.Aug 30 2023, 4:24 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 30 2023, 4:24 AM
sdesmalen added inline comments.Aug 30 2023, 6:52 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
7340

requiresLazySave should return true, even when the function has the aarch64_pstate_za_preserved attribute. That also makes this code below a bit simpler.

7352

Streaming-properties are orthogonal to ZA, so don't belong in this condition.

llvm/lib/Target/AArch64/Utils/AArch64SMEAttributes.h
81

Could you keep the original name please?

MattDevereau added inline comments.Aug 30 2023, 7:26 AM
llvm/lib/Target/AArch64/Utils/AArch64SMEAttributes.h
81

Can I do it as a separate change at least? It doesn't match the rest of the class's interface at all.

MattDevereau added inline comments.Aug 30 2023, 7:33 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
7352

It's possible for

if (RequiresSMChange)
  PStateSM = getPStateSM(DAG, Chain, CallerAttrs, DL, MVT::i64);

to call LowerCall an additional time and generate an extra redundant store into the TPIDR2 save slices parameter, which is why this is here. I will look for a better way of handling this though.

sdesmalen added inline comments.Aug 31 2023, 12:14 AM
llvm/lib/Target/AArch64/Utils/AArch64SMEAttributes.h
81

The preserving ZA is not really an interface in the sense that the ABI defines e.g. a SharedZA/PrivateZA/Streaming interface. The 'interface' part of the name refers to the caller having to be aware of the callee's interface when generating code for the call.

For example, a private-ZA interface requires the caller to set up the lazy-save mechanism when the caller has ZA state.
Or a streaming interface requires the caller to ensure that streaming-SVE mode is enabled before the call.

The __arm_preserves_za attribute (or aarch64_pstate_za_preserved as it's named in LLVM IR) is more of a 'hint' to the compiler to generate more efficient code. It is a property that can be ignored by the compiler and the generated code would still be correct.

Note that I pushed a change yesterday to rename hasNewZAInterface() in rG0a32a999aea07e373e1b3c9e6f0d20d3f7f1a7b8, because 'new ZA' is not part of the interface (externally visible to callers), but rather an implementation property of the function's body/definition.

As such, I'd rather stick with preservesZA as the name of this interface.

Matt added a subscriber: Matt.Aug 31 2023, 10:30 AM
sdesmalen added inline comments.Sep 6 2023, 3:22 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
7328

Am I reading this wrong, or should this have been named NumZaSaveSlicesAddr ?

7753–7789

Can you write this as:

if (RequiresLazySave) {
  if (!CalleeAttrs.preservesZA()) {
    // code to restore ZA
  }
  Result = DAG.getNode(...aarch64_sme_set_tpidr2...);
}
llvm/lib/Target/AArch64/Utils/AArch64SMEAttributes.h
38

nit: maybe add a comment that we use this for SME ABI routines.

85

nit: this interface seems redundant.

MattDevereau marked 4 inline comments as done.
sdesmalen added inline comments.Sep 6 2023, 6:50 AM
llvm/lib/Target/AArch64/Utils/AArch64SMEAttributes.cpp
37

Perhaps it's worth adding a new SMEAttrs constructor, such that you can use that both here and in AArch64ISelLowering.cpp

SMEAttrs(StringRef funcname) {  /* handle SME ABI functions here */ }
MattDevereau added inline comments.Sep 6 2023, 7:49 AM
llvm/lib/Target/AArch64/Utils/AArch64SMEAttributes.cpp
37

What benefit is this supposed to give? If you just use a StringRef param in a constructor you will have to make many assumptions to satisfy the mutex asserts in set and probably just end up doing Bitmask |= Za_NoLazySave with whatever comes out of that constructor anyway. Isn't getting this functionality from AArch64ISelLowering.cpp is what getCalleeAttrsFromExternalFunction is providing anyway?

sdesmalen added inline comments.Sep 6 2023, 11:32 PM
llvm/lib/Target/AArch64/Utils/AArch64SMEAttributes.cpp
37

My suggestion is to avoid having to write this if (FuncName == "__arm_tpidr2_save" || .. code in multiple places.

you will have to make many assumptions to satisfy the mutex asserts in set and probably just end up doing Bitmask |= Za_NoLazySave with whatever comes out of that constructor anyway

What assumptions will you have to make?

MattDevereau added inline comments.Sep 7 2023, 1:51 AM
llvm/lib/Target/AArch64/Utils/AArch64SMEAttributes.cpp
37

In the end I realised what you meant and I uploaded another patch set with the StringRef constructor

sdesmalen added inline comments.Sep 12 2023, 12:39 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
7309

Do we really need the if-statement? For example, can we just write:

else if (auto *ES = ...)
  CalleeAttrs = SMEAttrs(ES->getSymbol());

If the default for SMEAttrs(<symbol>) is that it doesn't have any streaming- or ZA properties then this is equivalent to what we had before this patch

llvm/lib/Target/AArch64/Utils/AArch64SMEAttributes.cpp
26–30

Does this need a new assert to say that NewZA and SharedZA cannot be combined with NoLazySave?

llvm/lib/Target/AArch64/Utils/AArch64SMEAttributes.h
90

I'd rather we don't add an operator bool(), because it's not clear to me what it is supposed to represent that the existing interfaces don't represent already.

MattDevereau marked 3 inline comments as done.
sdesmalen added inline comments.Sep 13 2023, 8:24 AM
llvm/test/CodeGen/AArch64/sme-lazy-save-call.ll
172

Could you also add the "aarch64_pstate_sm_compatible" attribute so that the compiler must call __arm_sme_state() as part of the call-sequence and test that we don't set up the lazy-save mechanism for that __arm_sme_state() call?

nit: Could you maybe also give this test a better name than foo? :)

MattDevereau marked an inline comment as done.
MattDevereau added inline comments.
llvm/test/CodeGen/AArch64/sme-lazy-save-call.ll
172

Done & nit done.

MattDevereau marked an inline comment as done.Sep 15 2023, 5:48 AM
sdesmalen accepted this revision.Sep 19 2023, 2:57 AM

LGTM

llvm/test/CodeGen/AArch64/sme-lazy-save-call.ll
216

nit: can you add "aarch64_pstate_za_preserved" next to the declaration of @private_za_preserved_callee similar to how you've done it for @za_shared_caller_za_preserved_callee, and then remove this attribute?

This revision is now accepted and ready to land.Sep 19 2023, 2:57 AM
This revision was landed with ongoing or failed builds.Sep 20 2023, 6:35 AM
This revision was automatically updated to reflect the committed changes.