This revision connects the generated sparse code with an actual
sparse storage scheme, which can be initialized from a test file.
Lacking a first-class citizen SparseTensor type (with buffer),
the storage is hidden behind an opaque pointer with some "glue"
to bring the pointer back to tensor land. Rather than generating
sparse setup code for each different annotated tensor (viz. the
"pack" methods in TACO), a single "one-size-fits-all" implementation
has been added to the runtime support library. Many details and
abstractions need to be refined in the future, but this revision
allows full end-to-end integration testing and performance
benchmarking (with on one end, an annotated Lingalg
op and, on the other end, a JIT/AOT executable).
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
It is a very nice PR, thanks for the great work! I finish one round of reading the code and haven't read the MLIR tests yet. Would like to send out my comments I have so far.
mlir/lib/Dialect/Linalg/Transforms/Sparsification.cpp | ||
---|---|---|
524 | I think the second line of the comment is not necessary, as this routine is checking whether the operand is in the form of a sparse tensor storage representation and dense content tensor certainly can be represented via sparse tensor storage. | |
534–535 | Sorry I don't understand the last part of the comment starting from "or ...". | |
578 | I think we need to remove this line. Because in line 618, the way allDense is used, it should just correspond to whether the tensor is in SparseTensorStorage representation. Am I right here? If I am right here, I would recommend the consolidate the two variables, atSparse and allDense, to one variable isSparseTensor. | |
mlir/lib/ExecutionEngine/SparseUtils.cpp | ||
106 | implement | |
128 | Initializes | |
168 | pass | |
288–289 | struct |
Thanks for the speedy review, Bixia!
mlir/lib/Dialect/Linalg/Transforms/Sparsification.cpp | ||
---|---|---|
524 | Okay, removed. It was to contrast "true" dense tensors (i.e. the ones that are marked as such in the annotations and are passed true tensor types) for "test" dense tensors (i.e. marked as dense but passed an opaque pointer that is set up with an artificial all-dense sparse storage scheme just for stress testing). But I removed the comment. | |
534–535 | Set to your suggestion. I was trying to give too much background information here, but got lost in the woods of details. | |
578 | Good observation, but this is a bit tricky. There are two sources for sparsity: the annotations, and the fact that we feed in a "glue" node from opaque pointer to tensor. The latter is relatively new, and is only temporary (this will go away once we have a proper sparse type). So most tests will still need the allDense to be set based on annotations, but the opague pointer needs the different DimOp. Once we have one type, this will indeed converge to one again. |
Looks okay in general. I would like some higher-level description of the storage scheme somewhere.
Please do not commit until @nicolasvasilache had an opportunity to look at Linalg ops.
mlir/include/mlir/Dialect/Linalg/IR/LinalgOps.td | ||
---|---|---|
525 ↗ | (On Diff #321014) | All these could use some examples in let description and a better specification of what the underlying assumptons are. |
539 ↗ | (On Diff #321014) | Nit: we use arg-type to result-type for cast-like operations, better to be consistent. |
552 ↗ | (On Diff #321014) | Are we sure this are going to be side effect-free in all cases? It seems to assume a specific underlying storage scheme for tensors, at which point it may be possible to just extract a pointer and wrap it into a memref. If the actual scheme is different, it may be allocating a memref and copying data into it... |
575 ↗ | (On Diff #321014) | Array sounds like it is 1D. |
576 ↗ | (On Diff #321014) | I feel like the there should be more invariants about some of these ops. Can it really convert _any_ tensor into _any_ strided memref? 4D tensor into a 2D column-major memref? |
mlir/lib/Dialect/Linalg/Transforms/SparseLowering.cpp | ||
7 | Something went wrong here. | |
26–28 | If you take ValueRange instead of ArrayRef<Value> (which you anyway should unless there is a strong reason not to), you can drop this and call operands.getTypes(). | |
78–82 | Nit: LLVM-style early return would be something like if (!eltType.isIndex() && !eltType.isInteger(64)) return rewriter.notifyMatchFailure("unsupported element type"); StringRef name = "sparsePtrsI64"; | |
mlir/lib/Dialect/Linalg/Transforms/Sparsification.cpp | ||
526 | Nit: we have isa_and_nonnull to avoid the actual cast when you don't need the result | |
mlir/lib/ExecutionEngine/SparseUtils.cpp | ||
111 | Nit: we prefer taking non-nullable mutable arguments by reference | |
132 | Do we want to make this function private? | |
175 | Maybe make the element type a template parameter of the enclosing class? | |
297 | MemRef1DU64 plz, we may need other ranks. |
mlir/include/mlir/Dialect/Linalg/IR/LinalgOps.td | ||
---|---|---|
575 ↗ | (On Diff #321014) | I think it is 1D. See also my response to the comment below. |
576 ↗ | (On Diff #321014) | I think the answer is YES, and the stride is trivial 1. The three Linalg_SparseTensorToXYZMemRefOp here are converting the SparseTensorStorage.positions[x]/indices[y]/values to MemRefOp as can be seen from the C APIs from the same source file. |
mlir/include/mlir/Dialect/Linalg/IR/LinalgOps.td | ||
---|---|---|
575 ↗ | (On Diff #321014) | If it is indeed 1D, than we should say so in the result type constraint or the verifier. |
576 ↗ | (On Diff #321014) | 2D memref have offset, 2 sizes and 2 strides. What are those numbers in case of 4D tensor input? 8D tensor input? Where do they come from? I see how this can be converted to a 1D memref<?xType> with an external indexing scheme (which still requires to verify that the operand and the result types have the same element type rather than accepting _anything_). |
Thanks all for the very detailed and useful feedback!
mlir/include/mlir/Dialect/Linalg/IR/LinalgOps.td | ||
---|---|---|
525 ↗ | (On Diff #321014) | You are right. These ops are probably short lived until we get a SparseType with ops but even then we can carry documentation over. Added. |
552 ↗ | (On Diff #321014) | You are right it is safer to remove this given the opaque implementation for now. It does not matter much for performance anyway, since the sparse compiler hoists these to the top loop always. |
576 ↗ | (On Diff #321014) | Any tensor in (since we still live in dense tensor land), but only 1D arrays out. I added that constraint so legality is verified better. |
mlir/lib/Dialect/Linalg/Transforms/SparseLowering.cpp | ||
7 | Ah, good catch! Thanks. | |
26–28 | Elegant. Thanks. | |
78–82 | Agreed in general, but this is a bit forward looking on adding more types here (which would make the early return a bit harder) | |
mlir/lib/Dialect/Linalg/Transforms/Sparsification.cpp | ||
526 | Also more elegant. Thanks. | |
578 | On second though I don't think we needed the specialized DimOp and thus this logic. Simplified again. | |
mlir/lib/ExecutionEngine/SparseUtils.cpp | ||
175 | You are way ahead of me. This class will indeed be templated by value type, index type and pointer type in the future (for now I just showed how the default f64/index/index case would be lowered. But added this just to indicate the future direction. |
Alex @ftynse, it would also be nice to use the memref in the input parameters, for example
void *newSparseTensor(char *filename, MemRef1DI1 ref) { assert(ref.stride == 1); return newSparseTensorC(filename, ref.base + ref.off); }
instead of
void *newSparseTensor(char *filename, bool *abase, bool *adata, uint64_t aoff, uint64_t asize, uint64_t astride) { assert(astride == 1); return newSparseTensorC(filename, abase + aoff); }
but this seems ABI incompatible with the ways arguments are passed C style for structs.
Do you know of any clever trick to make this work? It would read easiser....
Add the llvm.emit_c_interface attribute to the relevant functions, this will generate _mlir_ciface_<original_name> wrappers that take a pointer to the struct, unpack it, and forward the descriptor components to the actual call. More info here https://mlir.llvm.org/docs/LLVMDialectMemRefConvention/#c-compatible-wrapper-emission
Ah, thanks for the pointer to a detailed description (I had seen examples of this). This wrapper nicely hides these details, but I was really hoping for some direct C magic to either let MLIR pass by struct or C to scalarize the struct :-)
This is quite tricky as the way C-level types map to the LLVM level is dependent on the ABI, and all this logic is builtin clang itself. It would be a nice project to extract this from clang and make it reusable though :)
@nicolasvasilache since you were explicitly called out, would you mind having a look at the Linalg ops.
I suspect you are okay, since these form a short-lived glue until we have a sparse tensor type, but perhaps you have some other insights?
Please introduce the new ops independently with roundtrip, verifier and independent testing so their behavior is well-understood + tested and not just assumed by a single-file compiler.
The amount of tests involved here is quite scary to me: I assume it is well-understood that they will be unsupported in the future as the proper refactoring work starts?
mlir/include/mlir/Dialect/Linalg/IR/LinalgOps.td | ||
---|---|---|
453 ↗ | (On Diff #322234) | Let's please isolate these ops in a new LinalgSparseOps.td for now. |
491 ↗ | (On Diff #322234) | tensor_to_memref folds away as bufferization occurs. At lowering time I see: `// Lower sparse primitives to calls into runtime support library.` Please improve the doc so we understand what this does in practice. The way we proceed in MLIR for adding such ops is to add them one by one with proper roundtrip and invalid tests + in your case execution tests. Once the op semantics are well-understood and tested in isolation, composition becomes possible. I am not pasting this for each op but please do treat it as a comment for all ops that actually materialize into executable IR: they need to be introduced independently with all the supporting testing / semantics. |
mlir/include/mlir/Dialect/Linalg/IR/LinalgOps.td | ||
---|---|---|
453 ↗ | (On Diff #322234) | I have moved this into its own file now, and added more documentation to the ops themselves (but without going to the public Linalg doc for now). I also added roundtrip and isolated lowering tests. |
491 ↗ | (On Diff #322234) | I modified the doc and added some diagram at the top of the new file. |
Thanks Bixia!
mlir/include/mlir/Dialect/Linalg/IR/LinalgSparseOps.td | ||
---|---|---|
21 | Right you are! Thanks for catching. |
mlir/lib/Dialect/Linalg/Transforms/SparseLowering.cpp | ||
---|---|---|
120–123 | Nit: invert the condition and return failure() and drop the else. | |
mlir/lib/Dialect/Linalg/Transforms/Sparsification.cpp | ||
532 | Inputs / read-only args should be passed by const reference. const CodeGen &codegen? | |
mlir/lib/ExecutionEngine/SparseUtils.cpp | ||
173–175 | All of these are missing doc comments - triple doc comment instead of the trailing one. | |
401–404 | Regular // comment block. |
Thanks Nicolas and Bixia. I saw some last minute feedback came in after submitting, will address that in follow up changes.
mlir/lib/Dialect/Linalg/Transforms/SparseLowering.cpp | ||
---|---|---|
120–123 | This was suggested earlier too, but there will be more types added later (else ifs), where this kind of flow makes more sense. | |
mlir/lib/Dialect/Linalg/Transforms/Sparsification.cpp | ||
532 | I will make a pass to see where we can safely add const | |
mlir/lib/ExecutionEngine/SparseUtils.cpp | ||
401–404 | Ok will change |
I see and extra "em" in this word.