This is an archive of the discontinued LLVM Phabricator instance.

[mlir][ArmSME] Insert intrinsics to enable/disable ZA
ClosedPublic

Authored by c-rhodes on Jun 15 2023, 10:00 AM.

Details

Summary

This patch adds two LLVM intrinsics to the ArmSME dialect:

  • llvm.aarch64.sme.za.enable
  • llvm.aarch64.sme.za.disable

for enabling the ZA storage array [1], as well as patterns for inserting
them during legalization to LLVM at the start and end of functions if
the function has the 'arm_za' attribute (D152695).

In the future ZA should probably be automatically enabled/disabled when
lowering from vector to SME, but this should be sufficient for now at
least until we have patterns lowering to SME instructions that use ZA.

N.B. The backend function attribute 'aarch64_pstate_za_new' can be used
manage ZA state (as was originally tried in D152694), but it emits calls
to the following SME support routines [2] for the lazy-save mechanism
[3]:

  • __arm_tpidr2_restore
  • __arm_tpidr2_save

These will soon be added to compiler-rt but there's currently no public
implementation, and using this attribute would introduce an MLIR
dependency on compiler-rt. Furthermore, this mechanism is for routines
with ZA enabled calling other routines with it also enabled. We can
choose not to enable ZA in the compiler when this is case.

Depends on D152695

[1] https://developer.arm.com/documentation/ddi0616/aa
[2] https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#sme-support-routines
[3] https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#the-za-lazy-saving-scheme

Diff Detail

Event Timeline

c-rhodes created this revision.Jun 15 2023, 10:00 AM
Herald added a reviewer: ftynse. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
c-rhodes requested review of this revision.Jun 15 2023, 10:00 AM
dcaballe accepted this revision.Jun 15 2023, 9:54 PM

Thanks!

This revision is now accepted and ready to land.Jun 15 2023, 9:54 PM

Thanks Cullen, just a couple of minor suggestions.

mlir/lib/Dialect/ArmSME/Transforms/LegalizeForLLVMExport.cpp
58–59

This comment suggests that target.addDynamicallyLegalOp() would enable ZA, but that's not quite true, is it? Instead, it marks funcOp as dynamically legal and provides a custom legality check:

  • if arm_za attribute is present then legal if the first function Op is arm_sme::aarch64_sme_za_enable,
  • if arm_za is not present, mark as legal.

I guess that the comment refers to the overall legalisation process rather then specifically to this hook?

mlir/test/Dialect/ArmSME/enable-arm-za.mlir
2

Could you add another RUN line without enable-za and verify that there's no asm_sme.intr.za.{enable|disable}?

It would also be good to test without the -enable-arm-streaming pass.

c-rhodes updated this revision to Diff 532048.Jun 16 2023, 2:01 AM

Address comments

c-rhodes marked 2 inline comments as done.Jun 16 2023, 2:07 AM
c-rhodes added inline comments.
mlir/lib/Dialect/ArmSME/Transforms/LegalizeForLLVMExport.cpp
58–59

This comment suggests that target.addDynamicallyLegalOp() would enable ZA, but that's not quite true, is it? Instead, it marks funcOp as dynamically legal and provides a custom legality check:

  • if arm_za attribute is present then legal if the first function Op is arm_sme::aarch64_sme_za_enable,
  • if arm_za is not present, mark as legal.

I guess that the comment refers to the overall legalisation process rather then specifically to this hook?

Good point I've updated the comment hopefully makes more sense, cheers

mlir/test/Dialect/ArmSME/enable-arm-za.mlir
2

Could you add another RUN line without enable-za and verify that there's no asm_sme.intr.za.{enable|disable}?

Done

It would also be good to test without the -enable-arm-streaming pass.

I misunderstood what you mean here somehow, will update again to add this 😄

c-rhodes updated this revision to Diff 532052.Jun 16 2023, 2:11 AM
c-rhodes marked 2 inline comments as done.

Add test checking intrinsics aren't emitted when -enable-arm-streaming pass isn't specified.

awarzynski accepted this revision.Jun 16 2023, 2:24 AM

Cheers for addressing my comments, LGTM!

mlir/test/Dialect/ArmSME/enable-arm-za.mlir
1–3

Personal opinion, feel free to ignore.

  1. Custom prefix everywhere - this way you use the prefixes to "document" the test cases (e.g. "ENABLE-ZA" clearly suggests that in this case ZA is explicitly enabled)
  2. Drop CHECK from custom prefixes. I think that most people know that these are CHECK ... prefixes. And shorter prefix means less noise/repetition.
  3. Match the order of checks with the RUN lines (so, right now, it would be CHECK, then CHECK-ENABLE-ZA and CHECK-NO-ARM-STREAMING).

Like I said, person opinion, aka nit :)

awarzynski added inline comments.Jun 16 2023, 2:26 AM
mlir/test/Dialect/ArmSME/enable-arm-za.mlir
2

I misunderstood what you mean here somehow, will update again to add this 😄

Ah, I meant a test in which the input already contains arm_za. Added by hand rather then a compiler :)

This revision was landed with ongoing or failed builds.Jun 16 2023, 2:46 AM
This revision was automatically updated to reflect the committed changes.
c-rhodes marked an inline comment as done.
c-rhodes added inline comments.Jun 16 2023, 2:46 AM
mlir/test/Dialect/ArmSME/enable-arm-za.mlir
1–3

Personal opinion, feel free to ignore.

  1. Custom prefix everywhere - this way you use the prefixes to "document" the test cases (e.g. "ENABLE-ZA" clearly suggests that in this case ZA is explicitly enabled)
  2. Drop CHECK from custom prefixes. I think that most people know that these are CHECK ... prefixes. And shorter prefix means less noise/repetition.
  3. Match the order of checks with the RUN lines (so, right now, it would be CHECK, then CHECK-ENABLE-ZA and CHECK-NO-ARM-STREAMING).

Like I said, person opinion, aka nit :)

Done, cheers