This is an archive of the discontinued LLVM Phabricator instance.

[mlir][ArmSME] Extend streaming-mode pass to support enabling ZA
Closed · Public

Authored by c-rhodes on Jun 12 2023, 4:06 AM.

Details

Summary

This patch extends the 'enable-arm-streaming' pass with a new option to
enable the ZA storage array by adding the 'arm_za' attribute to
'func.func' ops.

A later patch will insert llvm.aarch64.sme.za.enable at the beginning
of 'func.func' ops and llvm.aarch64.sme.za.disable before
func.return statements when lowering to LLVM dialect.

Currently the pass only supports enabling ZA with streaming-mode on, but
the SME LDR, STR and ZERO instructions can access ZA when not in
streaming-mode (section B1.1.1, IDGNQM [1]), so it may be worth making
these options independent in the future.
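
As a rough illustration of the coupling described above, the tagging can be modelled with a toy C++ sketch. This is not the code in this patch: `ToyFunc`, `StreamingMode` and `tagFunction` are hypothetical names, and the streaming attribute names 'arm_streaming' and 'arm_locally_streaming' are assumptions made for the example. Only 'arm_za' comes from the summary.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for a 'func.func' op: a name plus a set of unit
// attributes. The real pass operates on MLIR operations, not on this struct.
struct ToyFunc {
  std::string name;
  std::set<std::string> attrs;
};

// Assumed streaming modes; the names are illustrative only.
enum class StreamingMode { Default, Locally };

// Models the limitation noted in the summary: a streaming attribute is always
// attached, and 'arm_za' can only be added on top of it.
void tagFunction(ToyFunc &fn, StreamingMode mode, bool enableZA) {
  fn.attrs.insert(mode == StreamingMode::Locally ? "arm_locally_streaming"
                                                 : "arm_streaming");
  if (enableZA)
    fn.attrs.insert("arm_za");
}
```

Making the two options independent, as suggested above, would amount to letting `enableZA` take effect without the streaming attribute being inserted first.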

N.B. This patch is generally useful in the context of SME enablement in
MLIR, but it will also help enable writing an integration test for the
rewrite pattern that lowers vector.transfer_write -> zero {za} (D152508).

[1] https://developer.arm.com/documentation/ddi0616/aa

Event Timeline

c-rhodes created this revision. Jun 12 2023, 4:06 AM
Herald added a project: Restricted Project. Jun 12 2023, 4:06 AM
c-rhodes requested review of this revision. Jun 12 2023, 4:06 AM
awarzynski accepted this revision. Jun 12 2023, 5:35 AM

Thanks for adding this, LGTM!

mlir/lib/Dialect/ArmSME/Transforms/EnableArmStreaming.cpp
75–78

[nit] I would reword a bit to highlight that:

  • right now, ZA can only be enabled when in streaming mode (that's a "limitation" of this pass), however,
  • it is also possible to use ZA when not in streaming mode (and we could add support for that later).

I am just thinking that this is clear to somebody very familiar with the spec, but perhaps it's worth clarifying for somebody with less background knowledge?

This revision is now accepted and ready to land. Jun 12 2023, 5:35 AM
c-rhodes updated this revision to Diff 530479. Jun 12 2023, 5:47 AM

Rebase and address comments.

c-rhodes marked an inline comment as done. Jun 12 2023, 5:47 AM

> Thanks for adding this, LGTM!

Thanks for reviewing!

dcaballe accepted this revision. Jun 13 2023, 11:25 AM

Thanks!

c-rhodes updated this revision to Diff 531811. Jun 15 2023, 10:11 AM
c-rhodes edited the summary of this revision.

As I mentioned on D152694 it isn't feasible to use the aarch64_pstate_za_new attribute since the backend emits calls to the following SME support routines [1] for the lazy-save mechanism [2]:

__arm_tpidr2_restore
__arm_tpidr2_save

These will soon be added to compiler-rt but there's currently no public implementation, and using this attribute would introduce an MLIR dependency on compiler-rt. Furthermore, this mechanism is for routines with ZA enabled calling other routines with it also enabled. We can choose not to enable ZA in the compiler when this is the case.

Given this, ZA has to be enabled manually with these intrinsics [3]:

llvm.aarch64.sme.za.enable
llvm.aarch64.sme.za.disable

I've updated this patch to add an arm_za attribute and in D153050 added the intrinsics, as well as patterns for inserting them during legalization to LLVM at the start and end of functions if the function has the 'arm_za' attribute.
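
The insertion described above can be sketched with a toy C++ model. This is only an illustration under stated assumptions, not the legalization patterns from D153050: `ToyFunction` and `insertZAToggles` are hypothetical names, and the function body is modelled as a flat list of op names rather than real MLIR operations.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a function: unit attributes plus a flat list of
// op names in program order. Real MLIR functions have regions and blocks.
struct ToyFunction {
  std::set<std::string> attrs;
  std::vector<std::string> body;
};

// Returns a copy of the body with ZA enabled at the start and disabled before
// every return, but only when the function carries the 'arm_za' attribute.
std::vector<std::string> insertZAToggles(const ToyFunction &fn) {
  if (fn.attrs.count("arm_za") == 0)
    return fn.body; // Functions without 'arm_za' are left untouched.

  std::vector<std::string> out;
  out.push_back("llvm.aarch64.sme.za.enable");
  for (const std::string &op : fn.body) {
    if (op == "func.return")
      out.push_back("llvm.aarch64.sme.za.disable");
    out.push_back(op);
  }
  return out;
}
```

Keying the insertion on the attribute keeps the decision (made by the pass) separate from the mechanism (applied during lowering), which is the split between the two patches.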

These 2 patches could be 1 really but given the approach has changed somewhat I didn't want to update this drastically.

[1] https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#sme-support-routines
[2] https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#the-za-lazy-saving-scheme
[3] https://github.com/llvm/llvm-project/blob/main/llvm/test/CodeGen/AArch64/sme-toggle-pstateza.ll

awarzynski accepted this revision. Jun 16 2023, 1:23 AM

LGTM, thanks!

> These 2 patches could be 1 really but given the approach has changed somewhat I didn't want to update this drastically.

IMHO, totally fine to have this as a separate patch.

This revision was landed with ongoing or failed builds. Jun 16 2023, 2:27 AM
This revision was automatically updated to reflect the committed changes.