This is an archive of the discontinued LLVM Phabricator instance.

[mlir][sparse][ArmSVE] Add sparse integration tests for ArmSVE
ClosedPublic

Authored by awarzynski on Mar 9 2022, 9:17 AM.

Details

Summary

LLVM backend for AArch64 does not currently support product reductions
so it requires some test code duplication to have a version for the
addition reductions. For all the other tests, we can run the vanilla
version with VLA compilation options.

Diff Detail

Event Timeline

jsetoain created this revision.Mar 9 2022, 9:17 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 9 2022, 9:17 AM
jsetoain updated this revision to Diff 425205.Apr 26 2022, 7:10 AM

Include scalable vector tests with the others

jsetoain added inline comments.Apr 27 2022, 2:51 AM
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_reductions.mlir
10–29

I need to remove this

jsetoain updated this revision to Diff 430683.May 19 2022, 7:58 AM

Add partial reduction tests for VLA testing

jsetoain updated this revision to Diff 437810.Jun 17 2022, 12:21 AM

Fix broken test. All VLA sparse tests have been verified.

jsetoain published this revision for review.Jun 17 2022, 12:24 AM
aartbik added inline comments.Jun 17 2022, 12:57 PM
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_reductions_vla.mlir
2 ↗(On Diff #437810)

For the fully duplicated test (new files), would it make sense to put them in

mlir/test/Integration/Dialect/SparseTensor/CPU/ArmSVE

Similar to what we did for

mlir/test/Integration/Dialect/Vector/CPU/ArmSVE

jsetoain updated this revision to Diff 438398.Jun 20 2022, 7:50 AM

Move test to ArmSVE-specific directory

Hi @jsetoain , thanks very much for this patch! I know that you have no access to SVE hardware ATM, so it might be tricky for you to finish this. And we (Arm) would love to have this merged :) Shall I take over? I do have access to the required testing infra.

Btw, these tests run mostly fine for me, but lli behaves slightly differently to mlir-cpu-runner - I had to add return 0 for the tests to pass. So, why not just use mlir-cpu-runner instead? I suspect that that's something to do with SVE support?

Hi @jsetoain , thanks very much for this patch! I know that you have no access to SVE hardware ATM, so it might be tricky for you to finish this. And we (Arm) would love to have this merged :) Shall I take over? I do have access to the required testing infra.

Btw, these tests run mostly fine for me, but lli behaves slightly differently to mlir-cpu-runner - I had to add return 0 for the tests to pass. So, why not just use mlir-cpu-runner instead? I suspect that that's something to do with SVE support?

You can rent SVE hardware at Amazon:
https://aws.amazon.com/ec2/instance-types/c7g/

Hi @jsetoain , thanks very much for this patch! I know that you have no access to SVE hardware ATM, so it might be tricky for you to finish this. And we (Arm) would love to have this merged :) Shall I take over? I do have access to the required testing infra.

Btw, these tests run mostly fine for me, but lli behaves slightly differently to mlir-cpu-runner - I had to add return 0 for the tests to pass. So, why not just use mlir-cpu-runner instead? I suspect that that's something to do with SVE support?

Hi Andrzej! By all means, please, feel free to take over. I'm trying to figure out a way to transfer the patch to you. The reason why I'm using lli here is because mlir-cpu-runner was not working. I don't remember the issue, exactly, and it might have been fixed by now. If you manage to get the test running with mlir-cpu-runner, that's definitely preferable to lli.

Thanks! 😊

awarzynski commandeered this revision.Oct 16 2022, 9:49 AM
awarzynski added a reviewer: jsetoain.

I'm trying to figure out a way to transfer the patch to you.

Done :)

Matt added a subscriber: Matt.Oct 19 2022, 5:18 AM

@aartbik Given https://reviews.llvm.org/D136183, we should probably park this for now?

CC @jsetoain

awarzynski abandoned this revision.Oct 25 2022, 3:03 PM

Abandoning this for now. Let's wait for updates on the sparse compiler before we decide what to do next: https://discourse.llvm.org/t/mlir-sparse-compiler-progress/60479/8.

Abandoning this for now. Let's wait for updates on the sparse compiler before we decide what to do next: https://discourse.llvm.org/t/mlir-sparse-compiler-progress/60479/8.

We will soon be back online with a much better SIMD approach!
Apologies for the slight detour in the meantime.

awarzynski reclaimed this revision.Nov 22 2022, 12:21 PM

With https://reviews.llvm.org/D138236 in-tree we can re-open this.

Rebased on top of main.

I switched from lli to mlir-cpu-runner as that was the path of least resitance for now. However, there's no -march and -mattr in mlir-cpu-runner. I need to double check that indeed SVE code is generated and run. If not, we either need to update mlir-cpu-runner or I need to use lli instead.

aartbik added inline comments.Nov 30 2022, 1:16 PM
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_flatten.mlir
34

Just a very quick comment, this will not be the right way to run the sparse vectorizer.

We will have to pass in the flags to the pipeline command sparse-compiler (see original vector example), but adding this support is still TBD. I was waiting for the vectorizer to be "production" ready before adding those ;-)

awarzynski added inline comments.Dec 1 2022, 7:28 AM
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_flatten.mlir
34

Thanks @aartbik ! Yeah, I jumped the gun here a bit :)

Revision https://reviews.llvm.org/D139581 should enable you to restart the efforts again!

Rebase on top of main

@aartbik Could you take a quick and let me know whether this makes sense? 3
tests are currently failing for me:

Failed Tests (3):
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_quantized_matmul.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_sampled_matmul.mlir
  MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_sampled_mm_fusion.mlir

I didn't get a chance to triage just yet.

Mostly just rebase on top of main

Can you please rebase until at least https://reviews.llvm.org/D139983 and try again.

Also, please note you may need enable-buffer-initialization=true if you use vector_transfer/print to CHECK the output if you use codegen instead of lib.

Rebase on top of main

I've also simplified my changes a bit to make them less intrusive.

aartbik added inline comments.Dec 15 2022, 10:46 AM
mlir/test/Integration/Dialect/SparseTensor/CPU/ArmSVE/lit.local.cfg
20 ↗(On Diff #483214)

period at end

mlir/test/Integration/Dialect/SparseTensor/CPU/ArmSVE/sparse_reductions_vla.mlir
1 ↗(On Diff #483214)

refresh my memory, why do we need the ArmSVE specific directory?
It seems that most test fit well at the same level, so we do no need this test and the lit local file?

mlir/test/Integration/Dialect/SparseTensor/CPU/lit.local.cfg
21

period at end

mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_cast.mlir
20

Phrase this a bit in the same style as the others,

If SVE is available, do the same run, but now with direct IR generation and VLA vectorization.

mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_filter_conv2d.mlir
119

Do you see a way to avoid adding this to *all* tests? Can we put this in its own file and let the command line compile take care of it?

aartbik added inline comments.Dec 27 2022, 9:52 AM
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_filter_conv2d.mlir
119

Here is an idea, add a new file entry.mlir with just this entry method, and then do

RUN: cat $s entry.mlir | rest of pipeline

so that the two files are concatenated into a single stream

Address PR comments

Thanks for the comments @aartbik :)

  • Moved entry_lli to a dedicated file (avoids code duplication)
  • Unified tests to use direct IR generation
  • Updated comments

refresh my memory, why do we need the ArmSVE specific directory?

Let me get back to you tomorrow - first I need to go over the discussion and refresh my own memory.

awarzynski added inline comments.Dec 27 2022, 11:39 AM
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_filter_conv2d.mlir
119

Thanks! I've derived a slightly different solution - see the latest update. it turns out lli will happily consume multiple files. Let me know whether you have any preference - I'm happy to try your suggestion instead.

aartbik added inline comments.Dec 27 2022, 1:16 PM
mlir/test/Integration/Dialect/SparseTensor/CPU/Inputs/main_for_lli.ll
2

I am okay keeping this at the same level. No need to introduce a whole new dir for just this

mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_filter_conv2d.mlir
116

please make sure to avoid unrelated diffs

119

even better

awarzynski added inline comments.Dec 27 2022, 2:50 PM
mlir/test/Integration/Dialect/SparseTensor/CPU/Inputs/main_for_lli.ll
2

We need to make sure that this file is not considered as a yet another test file. By putting it in "Inputs", we don't require any additional logic for this - see the definition of config.excludes.

I've not been able to find any documentation for this, but I've seen it used in various other sub-projects.

awarzynski added inline comments.Dec 28 2022, 8:59 AM
mlir/test/Integration/Dialect/SparseTensor/CPU/ArmSVE/sparse_reductions_vla.mlir
1 ↗(On Diff #483214)

The answer is in the original commit summary:

LLVM backend for AArch64 does not currently support product reductions so it requires some test code duplication to have a version for the addition reductions.

I did check the backend for a clear indication of what is currently supported and what is not, but it's a bit convoluted:

I will reach out internally to learn a bit more about the status of SVE in the context of reductions, but need to wait for folks to return from their Xmas breaks first.

I do think that we should avoid code duplication though and creating a new sub-directory here. Instead, I suggest splitting "sparse_reductions.mlir" into 2 files:

  • one that tests reductions supported by SVE
  • another one that contains the remaining reductions.

Let me upload this so that you get a better idea.

Split "sparse_reductions.mlir" into 2 test files

I split "sparse_reductions.mlir" into:

  • sparse_reductions_1.mlir (reductions supported by SVE)
  • sparse_reductions_2.mlir (all other reductions)

This way we can avoid adding "ArmSVE/sparse_reductions_vla.mlir", which
duplicated SVE-supported reductions from "sparse_reductions.mlir".

Remove the remaining unrelated changes

Also made sure that all SVE tests use direct IR generation

Yeah, this is LGTM after making sure the emulator works and addressing my last nit request.

mlir/test/Integration/Dialect/SparseTensor/CPU/Inputs/main_for_lli.ll
2

Ah, okay, makes sense.

mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_reductions_2.mlir
1 ↗(On Diff #485704)

I like this solution of splitting up the files much better than having yet another directory. Thanks!

But one last nit, I don't really like the _1 _2 naming, as it does not add much. Perhaps we can some up with a better name, like just sparse_reductions.mlir for the original and sparse_reductions_prod.mlir for this file, so it becomes more descriptive?

Yeah, this is LGTM after making sure the emulator works and addressing my last nit request.

I've experimented with both QEMU and ArmIE. While QEMU works, it crashes at the end of a simulation when running lli. I've tried to reduce my issue, but I can only reproduce it with lli (which is a relatively huge binary). As I haven't been able to find any evidence that QEMU actually ever worked here (end-to-end), I suggest leaving it for now. lli is rather too large as a reproducer. Any suggestions how to reduce this or where to report it?

Instead, I used ArmIE. It worked without any issues (I needed to replace lli with %lli in tests). It's a free emulator and the official instructions worked for me just fine. So I am hoping that this is sufficient here. I used the following CMake set-up:

cmake -DMLIR_INCLUDE_INTEGRATION_TESTS=True -DMLIR_RUN_ARM_SVE_TESTS=True -DARM_EMULATOR_OPTIONS="-msve-vector-bits=128" -DARM_EMULATOR_EXECUTABLE="armie" <other cmake options>

I'm just about to send a small update that will address the outstanding comment - rename "sparse_reductions_{1|2}.mlir". So, is this ready to be merged? :)

Replace lli with %lli to fix substition, rename test files

I'm just about to send a small update that will address the outstanding comment - rename "sparse_reductions_{1|2}.mlir". So, is this ready to be merged? :)

Aart is currently on vacation and will be back after Jan 20th. I am in his team and I would suggest that we wait until he is back because I am afraid that I do not have enough context/knowledge to make the decision by myself for this patch ;-)

aartbik accepted this revision.Jan 23 2023, 1:17 PM
This revision is now accepted and ready to land.Jan 23 2023, 1:17 PM