This is an archive of the discontinued LLVM Phabricator instance.

[mlir][ArmSME] Add conversion from ArmSME to SCF to materialize loops
ClosedPublic

Authored by c-rhodes on Jul 27 2023, 11:02 AM.

Details

Summary

Currently a loop is materialized when lowering ArmSME loads and stores
to intrinsics. This patch introduces two new ops to the ArmSME dialect
that map 1-1 with intrinsics:

  1. arm_sme.load_tile_slice - Loads a 1D tile slice from memory into a 2D SME "virtual tile".
  2. arm_sme.store_tile_slice - Stores a 1D tile slice from a 2D SME "virtual tile" into memory.

As well as a new conversion pass '-convert-arm-sme-to-scf' that
materializes loops with these ops. The existing load/store lowering to
intrinsics is updated to use these ops.

Depends on D156517

Discourse thread:
https://discourse.llvm.org/t/loop-materialization-in-armsme/72354

Diff Detail

Event Timeline

c-rhodes created this revision.Jul 27 2023, 11:02 AM
Herald added a reviewer: ftynse. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
c-rhodes requested review of this revision.Jul 27 2023, 11:02 AM
c-rhodes edited the summary of this revision. (Show Details)Jul 27 2023, 11:07 AM
Matt added a subscriber: Matt.Jul 27 2023, 11:12 AM
WanderAway added inline comments.Jul 28 2023, 11:06 AM
mlir/include/mlir/Dialect/ArmSME/IR/ArmSME.td
311

I feel like the name is a bit confusing, although I'm not sure I can come up with something better...
LoadTileSlice? LoadVectorToTile?

Also I wonder if it makes sense for this to have a horizontal/vertical load variant/argument, so that it would be possible to do in-flight transpose as well?

dcaballe accepted this revision.Jul 28 2023, 11:13 AM

LGTM! Just a few comments. Thanks for addressing this so quickly!

mlir/include/mlir/Dialect/ArmSME/IR/ArmSME.td
317–318

I don't follow here. The tile slice (1D) is defined by the 2D type? Do you mean by the dimension of the 2D type pointed by the index?

322

-> ~The slice of memory read is defined...?

329

This looks like a vector load with a passthru value so I would remove the _and_update part to make it shorter but up to you :)

mlir/lib/Conversion/ArmSMEToSCF/ArmSMEToSCF.cpp
85

Add a builder guard at the beginning of the function to restore the insertion point

149

same

This revision is now accepted and ready to land.Jul 28 2023, 11:13 AM
c-rhodes updated this revision to Diff 545630.Jul 31 2023, 6:35 AM
c-rhodes edited the summary of this revision. (Show Details)

Address comments

c-rhodes marked 6 inline comments as done.Jul 31 2023, 6:49 AM
c-rhodes added inline comments.
mlir/include/mlir/Dialect/ArmSME/IR/ArmSME.td
311

I feel like the name is a bit confusing, although I'm not sure I can come up with something better...
LoadTileSlice? LoadVectorToTile?

Also I wonder if it makes sense for this to have a horizontal/vertical load variant/argument, so that it would be possible to do in-flight transpose as well?

It was rather verbose but until Diego pointed out the passthru on other vectors ops I wasn't aware of any ops that worked similarly, so wanted to be explicit about what this is doing. I've renamed it to 'load_tile_slice' as suggested, I think that reads better that 'tile_slice_load'.

Also I wonder if it makes sense for this to have a horizontal/vertical load variant/argument, so that it would be possible to do in-flight transpose as well?

I think at some point it probably will but I've given little consideration to that so far and have gone with the assumption that everything is horizontal. I personally dont want to add it prematurely without having an understanding of how it fits into the big picture. Have you given much thought to how it would be used?

317–318

I don't follow here. The tile slice (1D) is defined by the 2D type? Do you mean by the dimension of the 2D type pointed by the index?

Apologies it's quite difficult writing generic descriptions for these ops. I think what I wanted to capture was e.g. for a tile of type vector<[4]x[4]xi32> the tile slice is vector<[4]xi32>, but I suppose the examples provide clarity there. I would like at some point to update the types in the asm to something like vector<[4]xi32> from memref<?x?xi8> into vector<[4]x[4]xi32>. Anyhow, I've updated the description, hopefully makes more sense.

awarzynski accepted this revision.Jul 31 2023, 7:52 AM

LGTM, great work, thanks!

mlir/include/mlir/Dialect/ArmSME/IR/ArmSME.td
322–323

[nit] Perhaps this is just me and also English is not my first language, but this reads a bit like:

The memref must be either "rank 1" or "rank 2 with dynamic dimensions"

i.e. as if "dynamic dimensions" only referred to "rank 2".

355

Given that type($tile) will always be equal to type($result), I wonder whether this wouldn't be cleaner:

let assemblyFormat = [{
    $base `[` $indices `]` `,` $tile `,` $tile_slice_index
      attr-dict `:` type($base) `,` type($tile)
  }];

or (with index type):

let assemblyFormat = [{
    $base `[` $indices `]` `,` $tile `,` $tile_slice_index
      attr-dict `:` type($base) `,` type($tile) `,` type($tile_slice_index)
  }]
mlir/lib/Conversion/ArmSMEToSCF/ArmSMEToSCF.cpp
72

[nit] "Init" in tileInit suggests that something might be initialised. Perhaps tileIdAsVector?

90

Can you expand?

102–103

Is this 2nd cast really needed? Wouldn't this work:

rewriter.replaceOp(tileLoadOp, tileInit);

? I'm probably missing something obvious 🤔 .

mlir/test/Dialect/ArmSME/invalid.mlir
77–81 ↗(On Diff #545630)

Similar test for arm_sme.store_tile_slice? If we preserve this assembly format.

c-rhodes updated this revision to Diff 545722.Jul 31 2023, 9:59 AM
c-rhodes marked 2 inline comments as done.

Address comments

WanderAway accepted this revision.Jul 31 2023, 10:08 AM

Have you given much thought to how it would be used?

Not too much at the current moment, only thing I have in mind so far is using it for transposition during matmul, since in order to do outer product, the A matrix slices would have to be a column vector, and B matrix slices a row vector. Also because gather load is not available during streaming mode.

c-rhodes marked 4 inline comments as done.Jul 31 2023, 10:14 AM
c-rhodes added inline comments.
mlir/include/mlir/Dialect/ArmSME/IR/ArmSME.td
322–323

[nit] Perhaps this is just me and also English is not my first language, but this reads a bit like:

The memref must be either "rank 1" or "rank 2 with dynamic dimensions"

i.e. as if "dynamic dimensions" only referred to "rank 2".

Good point, I've clarified it.

355

Given that type($tile) will always be equal to type($result), I wonder whether this wouldn't be cleaner:

let assemblyFormat = [{
    $base `[` $indices `]` `,` $tile `,` $tile_slice_index
      attr-dict `:` type($base) `,` type($tile)
  }];

it was originally like this but i changed the memref to have indices rather than a single index and TypesMatchWith no longer worked coming after Variadic<Index>, I've re-introduced this but moved indices after the tileto get it to work.

mlir/lib/Conversion/ArmSMEToSCF/ArmSMEToSCF.cpp
72

[nit] "Init" in tileInit suggests that something might be initialised. Perhaps tileIdAsVector?

the idea here was to get an initial tile, I've renamed it to tile.

90

Can you expand?

This is fixed in D156689

mlir/test/Dialect/ArmSME/invalid.mlir
77–81 ↗(On Diff #545630)

Similar test for arm_sme.store_tile_slice? If we preserve this assembly format.

Format has changed but this op doesn't have a result type anyway

c-rhodes updated this revision to Diff 545737.Jul 31 2023, 10:38 AM
c-rhodes marked 4 inline comments as done.

Remove unnecessary cast_vector_to_tile op at the end of tile_load conversion.

c-rhodes marked an inline comment as done.Jul 31 2023, 10:40 AM
c-rhodes added inline comments.
mlir/lib/Conversion/ArmSMEToSCF/ArmSMEToSCF.cpp
102–103

Is this 2nd cast really needed? Wouldn't this work:

rewriter.replaceOp(tileLoadOp, tileInit);

? I'm probably missing something obvious 🤔 .

You're right this isn't necessary, memory semantics prevent these from being reordered. I've removed it.

dcaballe accepted this revision.Jul 31 2023, 10:41 PM

Thanks for addressing the feedback!

This revision was landed with ongoing or failed builds.Aug 1 2023, 1:20 AM
This revision was automatically updated to reflect the committed changes.
c-rhodes marked an inline comment as done.