This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SME] Add codegen pass to handle ZA state in arm_new_za functions.
ClosedPublic

Authored by david-arm on Sep 14 2022, 2:20 PM.

Details

Summary

The new pass implements the following:

  • Inserts code at the start of an arm_new_za function to commit a lazy-save when the lazy-save mechanism is active.
  • Adds a smstart intrinsic at the start of the function.
  • Adds a smstop intrinsic at the end of the function.

Patch co-authored by kmclaughlin.

Diff Detail

Build Status
Buildable 188191

Event Timeline

sdesmalen created this revision.Sep 14 2022, 2:20 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 14 2022, 2:20 PM
sdesmalen requested review of this revision.Sep 14 2022, 2:20 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 14 2022, 2:20 PM
Matt added a subscriber: Matt.Sep 16 2022, 11:43 AM

Using an IR pass to implement ABI doesn't seem right to me. This could be done at codegen time, and having it at the IR level places implicit constraints on later IR passes that aren't enforced.

Even the IR pass was the right choice, I don't see anything here that necessitates a Module pass instead of the usual FunctionPass.

shruthiashwath added inline comments.Sep 20 2022, 7:37 AM
llvm/lib/Target/AArch64/SMEABIPass.cpp
138

I am not sure if I did look properly but I could not find this attribute in the documents, can you please help me understand this attribute "aarch64_expanded_pstate_za", or redirect me to the document

Using an IR pass to implement ABI doesn't seem right to me. This could be done at codegen time, and having it at the IR level places implicit constraints on later IR passes that aren't enforced.

Even the IR pass was the right choice, I don't see anything here that necessitates a Module pass instead of the usual FunctionPass.

Hi @aemerson, I think the reasons why we chose to do this in IR are:

  1. Partly due to legacy reasons, since for an earlier version of the ABI it made sense to implement aspects of the ABI in an IR pass. However, I guess there is now a less compelling reason to do so.
  2. I think it's probably neater to do this in IR since we can do both the function prologue and epilogue at the same time, otherwise in codegen I imagine we have to do it separately in different lowering functions? In an IR pass we can deal with all aspects of the attributes at the same time in the same function. IIRC it's easier to create/split blocks in IR too?

I can look into how much effort it would take to implement this in codegen, however I'm not sure what you mean by having it at the IR level places implicit constraints on later IR passes that aren't enforced? It's worth saying that we will need an SME ABI IR pass like when it comes to exception handling because that's too complicated to do at the codegen level.

Using an IR pass to implement ABI doesn't seem right to me. This could be done at codegen time, and having it at the IR level places implicit constraints on later IR passes that aren't enforced.

Even the IR pass was the right choice, I don't see anything here that necessitates a Module pass instead of the usual FunctionPass.

Hi @aemerson, I think the reasons why we chose to do this in IR are:

  1. Partly due to legacy reasons, since for an earlier version of the ABI it made sense to implement aspects of the ABI in an IR pass. However, I guess there is now a less compelling reason to do so.
  2. I think it's probably neater to do this in IR since we can do both the function prologue and epilogue at the same time, otherwise in codegen I imagine we have to do it separately in different lowering functions? In an IR pass we can deal with all aspects of the attributes at the same time in the same function. IIRC it's easier to create/split blocks in IR too?

I can look into how much effort it would take to implement this in codegen, however I'm not sure what you mean by having it at the IR level places implicit constraints on later IR passes that aren't enforced? It's worth saying that we will need an SME ABI IR pass like when it comes to exception handling because that's too complicated to do at the codegen level.

My concern was that by decorating the IR for ABI reasons we also rely on any future IR passes to not accidentally break it by placing instructions in the wrong place. Having said that, I suppose you technically have the same issue if you do it in a MachineFunction pass.

So in that respect my objection to an IR pass is gone, but I think this should be moved as late as possible in the pipeline before codegen runs. The current place it runs seems to be before a bunch of other passes.

Hi @aemerson and @shruthiashwath, while @sdesmalen is away I'm trying to deal with comments on his behalf. However, Phabricator doesn't let me update his patch, and instead forked off a new one here - D134437. Could you leave any comments on D134437 instead of this one from now on? Thanks!

david-arm commandeered this revision.Sep 22 2022, 9:06 AM
david-arm updated this revision to Diff 462196.
david-arm edited reviewers, added: sdesmalen; removed: david-arm.
  • Commandeered patch as per @aemerson's suggestion!
shruthiashwath added inline comments.Sep 22 2022, 7:18 PM
llvm/docs/AArch64SME.rst
432

How do you plan to use this further?
As the pass already handles arm_new_za, is there any specific functionality that needs to handled after the SelectionDAG ?

david-arm marked an inline comment as done.Sep 23 2022, 1:21 AM
david-arm added inline comments.
llvm/docs/AArch64SME.rst
432

This attribute is added purely as a marker to say the function has been processed. We did this mainly as a precaution in case for some reason the SME ABI pass got invoked twice in a row. Perhaps it's being overly cautious. :)

bryanpkc added inline comments.
llvm/lib/Target/AArch64/SMEABIPass.cpp
1

NIT: There should be one fewer space between "SME" and "ABI", and there should be one space after "ABI".

aemerson accepted this revision.Oct 4 2022, 6:56 AM

LGTM.

This revision is now accepted and ready to land.Oct 4 2022, 6:56 AM
c-rhodes added inline comments.
llvm/test/CodeGen/AArch64/sme-new-za-function.ll
3

Is there a reason for specifying the pass twice?

sdesmalen added inline comments.May 15 2023, 6:16 AM
llvm/test/CodeGen/AArch64/sme-new-za-function.ll
3

We added it to test that running the pass more than once leads to the same IR.

c-rhodes added inline comments.May 15 2023, 7:08 AM
llvm/test/CodeGen/AArch64/sme-new-za-function.ll
3

We added it to test that running the pass more than once leads to the same IR.

Ah ok! Thanks for clarifying, thought it might have been a mistake.