With VectorType supporting scalable dimensions, we don't need many of the operations currently present in ArmSVE, like mask generation and basic arithmetic instructions. Therefore, this patch also gets rid of those. Having built-in scalable vector support also simplifies the lowering of scalable vector dialects down to LLVMIR. Scalable dimensions are indicated with the scalable dimensions between square brackets: vector<[4]xf32> Is a scalable vector of 4 single precission floating point elements. More generally, a VectorType can have a set of fixed-length dimensions followed by a set of scalable dimensions: vector<2x[4x4]xf32> Is a vector with 2 scalable 4x4 vectors of single precission floating point elements. The scale of the scalable dimensions can be obtained with the Vector operation: %vs = vector.vscale This change is being discussed in the discourse RFC: https://llvm.discourse.group/t/rfc-add-built-in-support-for-scalable-vector-types/4484 Differential Revision: https://reviews.llvm.org/D111819
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
This direction makes a lot of sense to me, if we want to avoid code dup between the upcoming vector specific dialects (SVE and RISC-V at the moment).
Since this touches "core", however, I hope others chime in too.
mlir/include/mlir/Dialect/LLVMIR/LLVMTypes.h | ||
---|---|---|
449 | period at end | |
450 | would IsScalableVectorType be a bit more consistent with naming? | |
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td | ||
1123 ↗ | (On Diff #379753) | compile-time |
1125 ↗ | (On Diff #379753) | comma, no : Also, is this not better an e.g.? For example, .... |
mlir/include/mlir/IR/BuiltinTypes.td | ||
905 | Perhaps you can add some text calling out that < > is fixed length and << >> is scalable? | |
921 | period at end | |
924 | period at end | |
mlir/include/mlir/IR/OpBase.td | ||
660 | I wanted to say period at end, but I see that is not really the style in this file |
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td | ||
---|---|---|
1118–1120 ↗ | (On Diff #379753) | Why would this be here and not in say, the vector dialect? |
mlir/lib/Dialect/LLVMIR/IR/LLVMTypes.cpp | ||
554–555 | Drop else after return. | |
568–570 | ||
587–588 | Drop else after return. | |
mlir/lib/IR/BuiltinTypes.cpp | ||
297 | Use cast if you aren't checking the result, dyn_cast> can return null. | |
320 | Same here. | |
344 | And here, and others. | |
mlir/lib/Target/LLVMIR/TypeToLLVM.cpp | ||
147–148 | Drop else after return. |
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td | ||
---|---|---|
1118–1120 ↗ | (On Diff #379753) | Indeed, my first instinct was to put it in Vector, but since this value is tightly coupled with the type itself, which is _not_ part of Vector, it feels a bit out of place there as well. I believe that the Vector type not being part of the Vector dialect is what creates the situation. As things are, if there's a place to put runtime constants, that's where this should go. Alternatively, if there's a way to express runtime properties of a type, say: %0 = vector<<>>.scale : index That also looks somewhat right (if a bit ugly). In any case, Standard or Vector, neither place looks better than the other to me, if you see clearly that this makes more sense in Vector, I can move it there and everything else works the same. This is something I actually wanted feedback about, thanks :-) |
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td | ||
---|---|---|
1118–1120 ↗ | (On Diff #379753) | I don't think it fits well in standard though, given that the standard dialect is going away. It seems like a better home should be found for this. |
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td | ||
---|---|---|
1118–1120 ↗ | (On Diff #379753) | Agree. I'll move it to Vector as a non-entirely-terrible option and, should a better place become apparent, we can reconsider in the future. It's not intertwined with anything else, it'd be a quite innocuous change. Thanks! |
This direction makes a lot of sense to me, if we want to avoid code dup between the upcoming vector specific dialects (SVE and RISC-V at the moment).
Since this touches "core", however, I hope others chime in too.
My main concern is that the discussion on Discourse does not seem to have a conclusion right now.
My apologies. We did have a conversation this week but I didn't update the RFC (I have now). Although some details are still in the air, the need for a built-in type is not in question. This is just the first, most basic implementation possible.
mlir/include/mlir/Dialect/LLVMIR/LLVMTypes.h | ||
---|---|---|
450 | The get here is a bit weird, why not something like isScalableVectorType? | |
mlir/include/mlir/Dialect/Vector/VectorOps.td | ||
2404–2405 | You could also drop the trailing type if you want, it can be inferred. (i.e. = "attr-dict") | |
mlir/include/mlir/IR/BuiltinTypes.td | ||
941–942 | Is this wrapped at 80 characters? | |
mlir/lib/Dialect/ArmSVE/IR/ArmSVEDialect.cpp | ||
51 | ||
mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp | ||
161 | Why the extra spaces? | |
mlir/lib/Dialect/LLVMIR/IR/LLVMTypes.cpp | ||
551–557 | ||
mlir/lib/IR/BuiltinTypes.cpp | ||
295–298 | ||
318–319 | Same here. | |
342–343 | and here. | |
mlir/lib/Parser/TypeParser.cpp | ||
456–458 |
What's the next step on this? Seems like the RFC discussion got to a resolution right?
@nicolasvasilache @ftynse : can you chime in on the change in mlir/include/mlir/IR/BuiltinTypes.td ?
My apologies for the long delay, my biggest problem atm is I do not have a good mental model of how RVV and Arm SVE operate in detail.
I had started to read specs but it invariably gets pushed back on the stack as it is not high on my priority list ..
From a pure cleanup perspective, I generally like it.
From a composability perspective, I think I would prefer to have it spelled out as vector<<4>xf32> or vector<4*xf32> or vector<(4s)xf32>.
The rationale is that I think we still want to have n-D scalable vector types in MLIR to allow expressing a statically known number of 1-D scalable vectors that serves as an "unroll-and-jammed vector pack" vector<8x4*xf32>.
This would be more consistent with the design of the rest of the vector dialect.
One thing that is higher priority to me personally these days is that we are also exploring using the vector dialect as a programming model for GPUs.
In this context, vector<4x8x16*x32*xf32> would also make sense for us.
Bottom line, if we avoided anchoring on the current LLVM / HW implementation that only support 1-D scalable vectors and we made it future-proof in that direction, I am fine with proceeding.
Generalize the concept of scalable dimensions to support use cases unrelated to scalable vectors
mlir/include/mlir/Dialect/Vector/VectorOps.td | ||
---|---|---|
2399 | I've seen this, I'll take care of it together with any other necessary fix. | |
2404–2405 | It feels a bit "naked", but it might be because I'm used to see it with the return type attached. We can give it a go and see what people think, if people don't care, going "concise" is my preferred option. Is there a "good practices" manual for dialect syntax? I can't find one. | |
mlir/include/mlir/IR/BuiltinTypes.td | ||
907 | And this... | |
941–942 | Not sure what happened there. Good catch, thanks! | |
mlir/lib/Parser/TypeParser.cpp | ||
456–458 | Arg! That was embarrassing... Sorry about that! |
Mostly LGTM, I added 3 areas of improvements.
Once these are addressed I'll happily accept.
Thanks for your hard work and patience!
mlir/include/mlir/Dialect/Vector/VectorOps.td | ||
---|---|---|
2390 | Should we call this vector.vscale ? | |
2394 | I would emphasize that this is for 1-D scalable vectors and that there is currently no way to extract the scale I think the global vs local property of vscale should also be discussed here. I'd maybe even go as far as spelling it vector.scale.global in the future? Edit: as I read deeper through the PR, I am now unclear whether vector<[2x8]xf32> is the same as vector<[2]x[8]xf32> ? I think vector<[2x8]xf32> would make sense for SVE in MLIR (and would then get flattened to 1-D going through LLVM). In any case, please propose a few wording changes to integrate the relevant parts of my comments and disregard/add a TODO for the others :) | |
mlir/include/mlir/IR/BuiltinTypes.h | ||
323 | I fear this will prove annoying to use in practice .. Could we go with unsigned numScalableDims? Then you can just use APIS such as ArrayRef's shape.take_back(numScalableDims); and friends. | |
342 | this would get nicer with numScalableDims. | |
mlir/include/mlir/IR/BuiltinTypes.td | ||
928 | Now that I read this, I am unclear whether vector<[2x8]xf32> is the same as vector<[2]x[8]xf32>, I would think not and the latter form could be a future extension (if so, add a TODO)? | |
mlir/test/Dialect/ArmSVE/legalize-for-llvm.mlir | ||
1 | I would expect to see a test file (somwhere in the builtin stuff) where you have both:
In a followup PR, I'd love to see a 1-dim, 2-scale version of the neon 2d dot (or something equivalent) and see it lower to unrolled LLVM. |
Also signal boosting for @ThomasRaoux @dcaballe @springerm ; no need to review but be aware that this is coming.
mlir/include/mlir/IR/BuiltinTypes.td | ||
---|---|---|
942 | Can we use Optional here instead? -1 is a bit magic. | |
960–961 | Why not just isScalable? The naming here is a bit weird. | |
mlir/lib/IR/AsmPrinter.cpp | ||
1942 | Please cache the end iterator to avoid recomputing it every iteration. | |
mlir/lib/Parser/TypeParser.cpp | ||
522 | Looks like this is missing test coverage. |
mlir/include/mlir/IR/BuiltinTypes.td | ||
---|---|---|
928 | It is, indeed, very much not the same. I find it useful to think about something like [2x8] as a series of 2x8 blocks, one after another. Therefore, even though they would have the same memory requirements, [2x8], [8x2], [4x4], and [16] can represent different data arrangements when you're loading your data from memory. From that point of view, [2]x[8] can't be the same as [2x8] even if the scale for both dimensions is the same. In fact, I don't think something like [2]x[8] makes sense in the context of scalable vectors. For GPU thread blocks, the situation is different. I'm not involved with that work so I can't come up with anything on the spot, but I intuit it could have potentially useful cases. As this work progresses, I suspect we will need to come back to it. |
I've also moved a bunch of tests around. Scalable vector tests that are not SVE-specific have been moved either to Arithmetic or Vector, depending on their nature.
mlir/include/mlir/Dialect/Vector/VectorOps.td | ||
---|---|---|
2394 | That's not exactly right. You can have a 2D scalable vector, and vscale represents its multiplicity, but you can't have a 2D scalable vector with two different scales (which we might want to have for GPUs). As it is, we can't represent those yet, so I don't think we need to clarify that in the description. I've added the multi-dimensional multi-scale vector and local/global scale to a TODO for future reference. | |
mlir/include/mlir/IR/BuiltinTypes.td | ||
942 | Not sure how to use Optional for Types, this is the only way I found to provide a default value in a type builder. In any case, I've changed it to "numScalableDims" as suggested by Nicolas. It makes code a bit less awkward and conveniently replaces a arguably ugly "first dimension = -1" to a more semantically sensible "number of dimensions = 0". If you still find this unacceptable, I can look into adding an "Optional" equivalent for types. | |
mlir/test/Dialect/ArmSVE/legalize-for-llvm.mlir | ||
1 | RE follow-up PR, that was in my low priority TODO list, I'll move it to the main TODO, it should be a quick and easy change. |
LG from my point-of-view, but also get an LGTM from Nicolas as well.
mlir/include/mlir/Dialect/Vector/VectorOps.td | ||
---|---|---|
2390 | Can you move this into the documentation of the op? This seems useful to expose in the user facing docs. | |
2404–2405 | I don't think we have a "good practices" manual, though that sounds useful. | |
2415 | If a verifier isn't necessary, you can just ignore it. | |
mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp | ||
31 | nit: Prefer pre-increment unless you need post increment behavior. | |
38 | Same here. | |
mlir/lib/IR/AsmPrinter.cpp | ||
1942 | Unresolved. |
period at end