This is an archive of the discontinued LLVM Phabricator instance.

[mlir][ArmSME] Add tests for Streaming SVE
ClosedPublic

Authored by c-rhodes on Apr 12 2023, 4:13 AM.

Details

Summary

This patch adds a couple of tests for targeting Arm Streaming SVE (SSVE)
mode, part of the Arm Scalable Matrix Extension (SME).

SSVE is enabled in the backend at the function boundary by specifying
the aarch64_pstate_sm_enabled attribute, as documented here [1]. SSVE
can be targeted from MLIR by specifying this in the passthrough
attributes [2] and compiling with

-mattr=+sme,+sve -force-streaming-compatible-sve

The passthrough will propagate to the backend where smstart/smstop
will be emitted around the call to the SSVE function.

The set of legal instructions changes in SSVE,
-force-streaming-compatible-sve avoids the use of NEON entirely and
instead lowers to (streaming-compatible) SVE. The behaviour this flag
predicates will be hooked up to the function attribute in the future
such that simply specifying this (should) lead to correct
code-generation.

Two tests are added:

  • A basic LLVMIR test verifying the attribute is passed through.
  • An integration test calling a SSVE function.

The integration test can be run with QEMU.

[1] https://llvm.org/docs/AArch64SME.html
[2] https://mlir.llvm.org/docs/Dialects/LLVM/#attribute-pass-through

Diff Detail

Event Timeline

c-rhodes created this revision.Apr 12 2023, 4:13 AM
Herald added a reviewer: ftynse. · View Herald Transcript
Herald added a reviewer: dcaballe. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
c-rhodes requested review of this revision.Apr 12 2023, 4:13 AM

@c-rhodes , thanks for working on this, this is a major milestone for Arm's SME! :)

IIUC, these tests won't work unless run via an emulator. It would be good to document what emulator is needed and to disable these tests unless ARM_EMULATOR_EXECUTABLE is also specified.

Also, whenever testing on SVE hardware, we should avoid using an emulator to run SVE integration tests. Even if SME integration tests are enabled. We probably need ARM_SME_EMULATOR_EXECUTABLE so that:

  • ARM_EMULATOR_EXECUTABLE -> run Arm SVE integration tests via an emulator (optional, doesn't affect SME/SSVE integration tests),
  • ARM_SME_EMULATOR_EXECUTABLE -> defines the emulator to use for the Arm SME integration tests (required whenever MLIR_RUN_ARM_SME_TESTS is set, doesn't affect SVE integration tests),

@DavidSpickett , it would be great if we could run these on clang-aarch64-sve-vla. Any chance QEMU 7.1.0 is available there? (or could be installed from GitHub?)

mlir/test/Integration/Dialect/Vector/CPU/ArmSME/test-ssve.mlir
3

In the spirit of https://reviews.llvm.org/D148005, could you use mlir-cpu-runner instead?

mlir/test/Target/LLVMIR/arm-ssve.mlir
11

How about testing for smstart and smstop? Given how crucial these two instructions are, I think that that would be worthwhile.

@DavidSpickett , it would be great if we could run these on clang-aarch64-sve-vla. Any chance QEMU 7.1.0 is available there? (or could be installed from GitHub?)

We're on focal Ubuntu so it's certainly not in apt but happy to install it. Please confirm that the tests actually pass on 7.1.0 and don't hit bugs in QEMU.

Also, whenever testing on SVE hardware, we should avoid using an emulator to run SVE integration tests. Even if SME integration tests are enabled. We probably need ARM_SME_EMULATOR_EXECUTABLE so that:

Instead of enumerating the extension names you could keep the option global, and we just setup a non SVE bot to run SME and SVE tests on qemu.

For example:
clang-aarch64-sve-vla sets MLIR_RUN_ARM_SVE_TESTS=ON and runs SVE tests natively
clang-aarch64-full-2stage sets MLIR_RUN_ARM_SVE_TESTS=ON, MLIR_RUN_ARM_SME_TESTS=ON, ARM_EMULATOR_EXECUTABLE=<qemu> and runs everything under emulation

The majority of people at this point have neither SVE or SME, so adding a bunch of emulator options for this only serves the bots.

Overall, these tests should probably check actual host features but that looks like a more involved change. As in, a test should say "I want SVE" and whether SVE is got from hardware or emulator, that's up to lit to resolve. But don't attempt that in this change :)

@c-rhodes , thanks for working on this, this is a major milestone for Arm's SME! :)

IIUC, these tests won't work unless run via an emulator. It would be good to document what emulator is needed and to disable these tests unless ARM_EMULATOR_EXECUTABLE is also specified.

Happy to document what's needed but by disabling these tests unless an emulator is specified are we not just creating work for ourselves in the future? If you build the Arm SME integration tests and get SIGILLs running them on unsupported hardware that's on you?

mlir/test/Integration/Dialect/Vector/CPU/ArmSME/test-ssve.mlir
3

In the spirit of https://reviews.llvm.org/D148005, could you use mlir-cpu-runner instead?

Sure I can base this on your patch.

mlir/test/Target/LLVMIR/arm-ssve.mlir
11

How about testing for smstart and smstop? Given how crucial these two instructions are, I think that that would be worthwhile.

That's tested in the backend, I don't see any value in duplicating that in MLIR.

@DavidSpickett , it would be great if we could run these on clang-aarch64-sve-vla. Any chance QEMU 7.1.0 is available there? (or could be installed from GitHub?)

We're on focal Ubuntu so it's certainly not in apt but happy to install it. Please confirm that the tests actually pass on 7.1.0 and don't hit bugs in QEMU.

Cheers! I've just tested the SVE and SME integration tests pass on 7.1.0.

Also, whenever testing on SVE hardware, we should avoid using an emulator to run SVE integration tests. Even if SME integration tests are enabled. We probably need ARM_SME_EMULATOR_EXECUTABLE so that:

Instead of enumerating the extension names you could keep the option global, and we just setup a non SVE bot to run SME and SVE tests on qemu.

For example:
clang-aarch64-sve-vla sets MLIR_RUN_ARM_SVE_TESTS=ON and runs SVE tests natively
clang-aarch64-full-2stage sets MLIR_RUN_ARM_SVE_TESTS=ON, MLIR_RUN_ARM_SME_TESTS=ON, ARM_EMULATOR_EXECUTABLE=<qemu> and runs everything under emulation

The majority of people at this point have neither SVE or SME, so adding a bunch of emulator options for this only serves the bots.

Overall, these tests should probably check actual host features but that looks like a more involved change. As in, a test should say "I want SVE" and whether SVE is got from hardware or emulator, that's up to lit to resolve. But don't attempt that in this change :)

I agree with everything you've said here :)

Cheers! I've just tested the SVE and SME integration tests pass on 7.1.0.

Thanks, I'll get this installed onto the bots.

However you want to wire up the cmake options, we can do that over in llvm-zorg.

Matt added a subscriber: Matt.Apr 12 2023, 11:27 AM

@c-rhodes , thanks for working on this, this is a major milestone for Arm's SME! :)

IIUC, these tests won't work unless run via an emulator. It would be good to document what emulator is needed and to disable these tests unless ARM_EMULATOR_EXECUTABLE is also specified.

but by disabling these tests unless an emulator is specified are we not just creating work for ourselves in the future? If you build the Arm SME integration tests and get SIGILLs running them on unsupported hardware that's on you?

I don't necessarily agree and others might see this differently too. We know that these tests do require an emulator and that this is unlikely to change in the near future. So a user getting a SIGILL could argue that a configuration _without_ an emulator shouldn't be possible/enabled at all. But whether that's on the user or the developer is secondary, IMHO. What's more important is that we can make everyone's lives a bit easier and make sure that configs that are not supported (i.e. _without_ emulation) are also not implemented. Yes, one person would have to update this config once SME hardware is available, but that's a low price to pay in my view. I am happy to volunteer for that. In any case, this is a nice-to-have, not a blocker.

mlir/test/Integration/Dialect/Vector/CPU/ArmSME/test-ssve.mlir
3

Just to clarify - you won't need to rebase this on top of D148005 to use mlir-cpu-runner (SVE and SME tests do not share their lit.local.cfg ATM, right?)

mlir/test/Target/LLVMIR/arm-ssve.mlir
11

That's tested in the backend, I don't see any value in duplicating that in MLIR.

I think that that's a bit more nuanced than this.

  1. I couldn't find a test in the backend that would specifically verify that aarch64_pstate_sm_enabled --> smstart + smstop (though there are tests that would fail if this condition no longer holds, e.g. sme-streaming-body.ll)
  2. One can argue that "LLVM IR MLIR dialect" --> "LLVM IR" translation is already thoroughly tested in MLIR, so there's no need for another test at all (i.e. "mlir/test/Target/LLVMIR/arm-ssve.mlir" is not needed).

A couple of other points:

  • To me, aarch64_pstate_sm_enabled is an LLVM implementation detail. From the point of view of "SME integration", I'm more interested in making sure that there's a mechanism in MLIR to take us from said mechanism to smstart/smstop. Yes, this sort of low level hardware details are normally a domain of LLVM backends, but it's unavoidable for integration tests to cross the boundaries of sub-projects. This would be only one test to verify the most fundamental part of the SME extension.
  • This would also be a nice way of documenting what the purpose of this attribute is (again, this information is available in various places, but it would be nice to make it obvious for folks working primarily on MLIR). Also, llc is already available in MLIR.

Admittedly, other the tests in this directory focus on LLVM IR and I am thinking about another "integration test". So this would need to go elsewhere. As for duplication - there's probably over 100k tests in llvm-project (I haven't checked) - having one extra test would have negligible impact.

Again, not a blocker, just a nice-to-have. I mostly wanted to share my rationale for the suggestion. I'm definitely in favor of keeping this test.

QEMU is now available on our AArch64 bots:

root@linaro-clang-aarch64-lld-2stage:/# qemu-aarch64 --version
qemu-aarch64 version 7.1.0
Copyright (c) 2003-2022 Fabrice Bellard and the QEMU Project developers

Please put up a patch for llvm-zorg to setup the bots.

mlir/test/Integration/Dialect/Vector/CPU/ArmSME/test-ssve.mlir
3

Just to clarify - you won't need to rebase this on top of D148005 to use mlir-cpu-runner (SVE and SME tests do not share their lit.local.cfg ATM, right?)

Your patch adds the mlir-cpu-runner executable to the top-level mlir/test Lit config, only a small change so I guess I could do the same change here.

mlir/test/Target/LLVMIR/arm-ssve.mlir
11
  1. I couldn't find a test in the backend that would specifically verify that aarch64_pstate_sm_enabled --> smstart + smstop (though there are tests that would fail if this condition no longer holds, e.g. sme-streaming-body.ll)

https://github.com/llvm/llvm-project/blob/e3175f7f1b55a38e9ab845ca26f0c4343ee2f7ff/llvm/test/CodeGen/AArch64/sme-streaming-interface.ll#L29

  1. One can argue that "LLVM IR MLIR dialect" --> "LLVM IR" translation is already thoroughly tested in MLIR, so there's no need for another test at all (i.e. "mlir/test/Target/LLVMIR/arm-ssve.mlir" is not needed).

That's a fair point and did cross my mind, that test is effectively testing the passthrough mechanism which is already tested so it could be removed.

A couple of other points:

  • To me, aarch64_pstate_sm_enabled is an LLVM implementation detail. From the point of view of "SME integration", I'm more interested in making sure that there's a mechanism in MLIR to take us from said mechanism to smstart/smstop. Yes, this sort of low level hardware details are normally a domain of LLVM backends, but it's unavoidable for integration tests to cross the boundaries of sub-projects. This would be only one test to verify the most fundamental part of the SME extension.
  • This would also be a nice way of documenting what the purpose of this attribute is (again, this information is available in various places, but it would be nice to make it obvious for folks working primarily on MLIR). Also, llc is already available in MLIR.

Admittedly, other the tests in this directory focus on LLVM IR and I am thinking about another "integration test". So this would need to go elsewhere. As for duplication - there's probably over 100k tests in llvm-project (I haven't checked) - having one extra test would have negligible impact.

Again, not a blocker, just a nice-to-have. I mostly wanted to share my rationale for the suggestion. I'm definitely in favor of keeping this test.

The MLIR testing guidelines don't seem to mention anything about testing code-generation and I only see a single test using llc so I'm hesitant to add such a test given there's not a strong precedent and there is coverage in the backend testing smstart/smstop are emitted for this IR.

QEMU is now available on our AArch64 bots:

root@linaro-clang-aarch64-lld-2stage:/# qemu-aarch64 --version
qemu-aarch64 version 7.1.0
Copyright (c) 2003-2022 Fabrice Bellard and the QEMU Project developers

Please put up a patch for llvm-zorg to setup the bots.

Thanks for installing! Will do.

c-rhodes added inline comments.Apr 13 2023, 6:04 AM
mlir/test/Integration/Dialect/Vector/CPU/ArmSME/test-ssve.mlir
3

In the spirit of https://reviews.llvm.org/D148005, could you use mlir-cpu-runner instead?

Ah, just realised mlir-cpu-runner doesn't support -force-streaming-compatible-sve flag. It will be possible to use mlir-cpu-runner once the behaviour that flag predicates is hooked up to the "aarch64_pstate_sm_enabled" function attr.

c-rhodes updated this revision to Diff 513226.Apr 13 2023, 7:37 AM

In lit config use absolute path to lli in llvm_tools_dir when running under emulation and no lli executable is specified. The top-level lit config adds llvm-tools-dir to PATH but this is lost when running under emulation such that previously -DARM_EMULATOR_LLI_EXECUTABLE=<path-to-llvm-tools-dir>/lli had to be specified when building. I think config.arm_emulator_lli_executable is only relevant when cross-compiling. This simplifies the flags in D148232.

QEMU is now available on our AArch64 bots:

root@linaro-clang-aarch64-lld-2stage:/# qemu-aarch64 --version
qemu-aarch64 version 7.1.0
Copyright (c) 2003-2022 Fabrice Bellard and the QEMU Project developers

Please put up a patch for llvm-zorg to setup the bots.

Done, see D148232.

awarzynski added inline comments.Apr 14 2023, 1:22 AM
mlir/test/Integration/Dialect/SparseTensor/CPU/lit.local.cfg
23–25 ↗(On Diff #513226)

this is lost when running under an emulator

Why and where? This feels odd - shouldn't we be fixing that instead of adding similar logic in 3 different places?

mlir/test/Integration/Dialect/Vector/CPU/ArmSME/test-ssve.mlir
3

Hooking -force-streaming-compatible-sve into mlir-cpu-runner would also be an option, but definitely not in this patch.

10

IIUC, the only bit that makes this test require SME is this:

{passthrough = ["aarch64_pstate_sm_enabled"]}

Just checking my understanding.

c-rhodes added inline comments.Apr 14 2023, 7:43 AM
mlir/test/Integration/Dialect/SparseTensor/CPU/lit.local.cfg
23–25 ↗(On Diff #513226)

this is lost when running under an emulator

Why and where? This feels odd - shouldn't we be fixing that instead of adding similar logic in 3 different places?

Agree the duplication isn't ideal, these configs need refactoring. When running normally I think lli is found because of this the PATH modifications here: https://github.com/llvm/llvm-project/blob/9bc5e8c87e9be8db2d65e71f90ba0ceea4b814a4/mlir/test/lit.cfg.py#L64

When running under emulation this doesn't work and using an absolute path here fixes it. I did try forwarding PATH to QEMU invocation with -E to no avail.

mlir/test/Integration/Dialect/Vector/CPU/ArmSME/test-ssve.mlir
3

Hooking -force-streaming-compatible-sve into mlir-cpu-runner would also be an option, but definitely not in this patch.

Mirroring llc flags in mlir-cpu-runner isn't particularly ideal, be nice if there were a better way

10

IIUC, the only bit that makes this test require SME is this:

{passthrough = ["aarch64_pstate_sm_enabled"]}

Just checking my understanding.

Correct, the backend will emit smstart/smstop around calls to functions with this attribute.

awarzynski accepted this revision.Apr 14 2023, 8:04 AM

LGTM, thanks for working on this!

Please wait a couple of days before merging - just in case other potential reviewers are away :) Also, I would be nice to send a short PSA about this to https://discourse.llvm.org/c/mlir/. I feel that lots of folks might be interested. Also, it's something worth celebrating 🎉 .

mlir/test/Integration/Dialect/SparseTensor/CPU/lit.local.cfg
23–25 ↗(On Diff #513226)

Cheers for the explanation. We should probably just include lli in https://github.com/llvm/llvm-project/blob/9bc5e8c87e9be8db2d65e71f90ba0ceea4b814a4/mlir/test/lit.cfg.py#L67-L88. I feel that that would help simplify quite a few things. Another patch, another time :)

mlir/test/Integration/Dialect/Vector/CPU/ArmSME/test-ssve.mlir
3

Mirroring llc flags in mlir-cpu-runner isn't particularly ideal, be nice if there were a better way

Actually, this is a global LLVM backend flag rather than something specific to llc and/or lli :) Flags like this are usually exposed in other drivers (e.g. clang and flang-new) via -mllvm. As for mlir-cpu-runner, I think that this would be no different to --mattr and --march which are also supported. In general, tools that "drive" the LLVM backend have access to these options.

This revision is now accepted and ready to land.Apr 14 2023, 8:04 AM
aartbik added inline comments.Apr 15 2023, 5:00 PM
mlir/test/Integration/Dialect/SparseTensor/CPU/lit.local.cfg
23–25 ↗(On Diff #513226)

Can you please add to TODO with name here. I am okay accepting this for now, but I am not too happy how complex the SparseTensor lit has become, so I would like to make sure this is truly fixed in some future... ;-)

c-rhodes updated this revision to Diff 514547.Apr 18 2023, 12:52 AM

Add a TODO to capture refactoring

c-rhodes marked an inline comment as done.Apr 18 2023, 12:52 AM
c-rhodes added inline comments.Apr 21 2023, 1:03 AM
mlir/test/Integration/Dialect/SparseTensor/CPU/lit.local.cfg
23–25 ↗(On Diff #513226)

Can you please add to TODO with name here. I am okay accepting this for now, but I am not too happy how complex the SparseTensor lit has become, so I would like to make sure this is truly fixed in some future... ;-)

Hi Aart, I've added the TODO, are you happy with this now?

c-rhodes added inline comments.Apr 21 2023, 8:34 AM
mlir/test/Integration/Dialect/SparseTensor/CPU/lit.local.cfg
23–25 ↗(On Diff #513226)

Can you please add to TODO with name here. I am okay accepting this for now, but I am not too happy how complex the SparseTensor lit has become, so I would like to make sure this is truly fixed in some future... ;-)

Hi Aart, I've added the TODO, are you happy with this now?

This refactoring turned out to be a lot more involved than I anticipated, but I've posted a follow up D148929 as a potential fix. Let us know what you think. Cheers.

aartbik accepted this revision.Apr 24 2023, 12:21 PM

Yes, thanks for adding the TODO!

This revision was automatically updated to reflect the committed changes.