Page MenuHomePhabricator

[mlir][RFC] Add scalable dimensions to VectorType

Authored by jsetoain on Oct 14 2021, 9:51 AM.


With VectorType supporting scalable dimensions, we don't need many of
the operations currently present in ArmSVE, like mask generation and
basic arithmetic instructions. Therefore, this patch also gets
rid of those.

Having built-in scalable vector support also simplifies the lowering of
scalable vector dialects down to LLVMIR.

Scalable dimensions are indicated with the scalable dimensions
between square brackets:


Is a scalable vector of 4 single precission floating point elements.

More generally, a VectorType can have a set of fixed-length dimensions
followed by a set of scalable dimensions:


Is a vector with 2 scalable 4x4 vectors of single precission floating
point elements.

The scale of the scalable dimensions can be obtained with the Vector

        %vs = vector.vscale

This change is being discussed in the discourse RFC:

Differential Revision:

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

This direction makes a lot of sense to me, if we want to avoid code dup between the upcoming vector specific dialects (SVE and RISC-V at the moment).
Since this touches "core", however, I hope others chime in too.

My main concern is that the discussion on Discourse does not seem to have a conclusion right now.

This direction makes a lot of sense to me, if we want to avoid code dup between the upcoming vector specific dialects (SVE and RISC-V at the moment).
Since this touches "core", however, I hope others chime in too.

My main concern is that the discussion on Discourse does not seem to have a conclusion right now.

My apologies. We did have a conversation this week but I didn't update the RFC (I have now). Although some details are still in the air, the need for a built-in type is not in question. This is just the first, most basic implementation possible.

jsetoain updated this revision to Diff 384043.Nov 2 2021, 4:32 AM
jsetoain marked 16 inline comments as done.

Address reviewers' comments

jsetoain marked an inline comment as done.Nov 2 2021, 4:34 AM

I've addressed all the comments.

jsetoain edited the summary of this revision. (Show Details)Nov 2 2021, 4:41 AM
jsetoain updated this revision to Diff 384491.Nov 3 2021, 9:32 AM

Fixed formatting

jsetoain updated this revision to Diff 386113.Nov 10 2021, 3:44 AM

Rebase on main

jsetoain updated this revision to Diff 389149.Nov 23 2021, 3:35 AM

Rebase on main

jsetoain updated this revision to Diff 390086.Nov 26 2021, 9:28 AM

Rebase on main

rriddle added inline comments.Nov 30 2021, 4:12 PM

The get here is a bit weird, why not something like isScalableVectorType?


You could also drop the trailing type if you want, it can be inferred. (i.e. = "attr-dict")


Is this wrapped at 80 characters?


Why the extra spaces?


Same here.


and here.

jsetoain updated this revision to Diff 390963.Dec 1 2021, 2:39 AM
jsetoain marked 10 inline comments as done.

Address review comments

jsetoain edited the summary of this revision. (Show Details)Dec 1 2021, 2:45 AM

What's the next step on this? Seems like the RFC discussion got to a resolution right?

@nicolasvasilache @ftynse : can you chime in on the change in mlir/include/mlir/IR/ ?

My apologies for the long delay, my biggest problem atm is I do not have a good mental model of how RVV and Arm SVE operate in detail.
I had started to read specs but it invariably gets pushed back on the stack as it is not high on my priority list ..

From a pure cleanup perspective, I generally like it.

From a composability perspective, I think I would prefer to have it spelled out as vector<<4>xf32> or vector<4*xf32> or vector<(4s)xf32>.
The rationale is that I think we still want to have n-D scalable vector types in MLIR to allow expressing a statically known number of 1-D scalable vectors that serves as an "unroll-and-jammed vector pack" vector<8x4*xf32>.
This would be more consistent with the design of the rest of the vector dialect.

One thing that is higher priority to me personally these days is that we are also exploring using the vector dialect as a programming model for GPUs.
In this context, vector<4x8x16*x32*xf32> would also make sense for us.

Bottom line, if we avoided anchoring on the current LLVM / HW implementation that only support 1-D scalable vectors and we made it future-proof in that direction, I am fine with proceeding.

jsetoain updated this revision to Diff 391762.Dec 3 2021, 3:38 PM

Generalize the concept of scalable dimensions to support use cases unrelated to scalable vectors

jsetoain retitled this revision from [mlir][RFC] Make scalable vector type a built-in type to [mlir][RFC] Add scalable dimensions to VectorType.Dec 3 2021, 3:39 PM
jsetoain edited the summary of this revision. (Show Details)

I've seen this, I'll take care of it together with any other necessary fix.


It feels a bit "naked", but it might be because I'm used to see it with the return type attached. We can give it a go and see what people think, if people don't care, going "concise" is my preferred option. Is there a "good practices" manual for dialect syntax? I can't find one.


And this...


Not sure what happened there. Good catch, thanks!


Arg! That was embarrassing... Sorry about that!

Mostly LGTM, I added 3 areas of improvements.
Once these are addressed I'll happily accept.
Thanks for your hard work and patience!


Should we call this vector.vscale ?


I would emphasize that this is for 1-D scalable vectors and that there is currently no way to extract the scale
for a >1-D scalable vectors.
This instruction may be extended in the future to take a position but I am unclear whether this is what we want atm.

I think the global vs local property of vscale should also be discussed here.

I'd maybe even go as far as spelling it in the future?

Edit: as I read deeper through the PR, I am now unclear whether vector<[2x8]xf32> is the same as vector<[2]x[8]xf32> ?

I think vector<[2x8]xf32> would make sense for SVE in MLIR (and would then get flattened to 1-D going through LLVM).
In the future we may also want vector<[2]x[8]xf32> for GPUs but this is not the same representation?
Is this what you have in mind ?

In any case, please propose a few wording changes to integrate the relevant parts of my comments and disregard/add a TODO for the others :)


I fear this will prove annoying to use in practice ..

Could we go with unsigned numScalableDims?

Then you can just use APIS such as ArrayRef's shape.take_back(numScalableDims); and friends.


this would get nicer with numScalableDims.


Now that I read this, I am unclear whether vector<[2x8]xf32> is the same as vector<[2]x[8]xf32>, I would think not and the latter form could be a future extension (if so, add a TODO)?
This really depends on whether you think you can make use of vector<[2x8]xf32> in MLIR instead of having to represent as vector<[16]xf32>; I claim you would have a bunch of nice use cases for this (coupled with the shape_cast op once properly extended).


I would expect to see a test file (somwhere in the builtin stuff) where you have both:

  • negative tests for various failure modes of misuses of scalable vectors (with appropriate error messages)
  • positive tests with multi-dim multi-scale vector (atm everything I see is 0-dim 1-scale only).

In a followup PR, I'd love to see a 1-dim, 2-scale version of the neon 2d dot (or something equivalent) and see it lower to unrolled LLVM.

Also signal boosting for @ThomasRaoux @dcaballe @springerm ; no need to review but be aware that this is coming.

rriddle added inline comments.Dec 6 2021, 12:35 AM

Can we use Optional here instead? -1 is a bit magic.


Why not just isScalable? The naming here is a bit weird.


Please cache the end iterator to avoid recomputing it every iteration.


Looks like this is missing test coverage.

jsetoain added inline comments.Dec 6 2021, 4:11 AM

It is, indeed, very much not the same. I find it useful to think about something like [2x8] as a series of 2x8 blocks, one after another. Therefore, even though they would have the same memory requirements, [2x8], [8x2], [4x4], and [16] can represent different data arrangements when you're loading your data from memory. From that point of view, [2]x[8] can't be the same as [2x8] even if the scale for both dimensions is the same. In fact, I don't think something like [2]x[8] makes sense in the context of scalable vectors. For GPU thread blocks, the situation is different. I'm not involved with that work so I can't come up with anything on the spot, but I intuit it could have potentially useful cases. As this work progresses, I suspect we will need to come back to it.

jsetoain updated this revision to Diff 393535.Dec 10 2021, 10:42 AM
jsetoain marked 9 inline comments as done.

Addressed latest round of reviews

I've also moved a bunch of tests around. Scalable vector tests that are not SVE-specific have been moved either to Arithmetic or Vector, depending on their nature.


That's not exactly right. You can have a 2D scalable vector, and vscale represents its multiplicity, but you can't have a 2D scalable vector with two different scales (which we might want to have for GPUs). As it is, we can't represent those yet, so I don't think we need to clarify that in the description. I've added the multi-dimensional multi-scale vector and local/global scale to a TODO for future reference.


Not sure how to use Optional for Types, this is the only way I found to provide a default value in a type builder. In any case, I've changed it to "numScalableDims" as suggested by Nicolas. It makes code a bit less awkward and conveniently replaces a arguably ugly "first dimension = -1" to a more semantically sensible "number of dimensions = 0". If you still find this unacceptable, I can look into adding an "Optional" equivalent for types.


RE follow-up PR, that was in my low priority TODO list, I'll move it to the main TODO, it should be a quick and easy change.

jsetoain edited the summary of this revision. (Show Details)Dec 10 2021, 10:45 AM
rriddle accepted this revision.Dec 10 2021, 7:27 PM

LG from my point-of-view, but also get an LGTM from Nicolas as well.


Can you move this into the documentation of the op? This seems useful to expose in the user facing docs.


I don't think we have a "good practices" manual, though that sounds useful.


If a verifier isn't necessary, you can just ignore it.


nit: Prefer pre-increment unless you need post increment behavior.


Same here.



This revision is now accepted and ready to land.Dec 10 2021, 7:27 PM
jsetoain updated this revision to Diff 393961.Dec 13 2021, 10:39 AM
jsetoain marked 7 inline comments as done.

Address reviewer comments

Addressed comments.

Thanks for getting these through @jsetoain !

jsetoain updated this revision to Diff 394198.Dec 14 2021, 3:16 AM

Rebase on main

This revision was automatically updated to reflect the committed changes.