This is an archive of the discontinued LLVM Phabricator instance.

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
671	Technically, we also fall into this branch for the strange dense to dense conversion. We will have typically folded those away, but I would not completely rely on this having taken place always and defend against that in the code.
677–679	Note that in the context of another project, we may migrate library code to actual codegen (which has the advantage of a smaller memory footprint potentially and allows for "unforeseen" type combinations). Such a migration may take care of all such performance concerns.
685	I think we need a subtle rewriting of the new call utility, since we need the sparse encoding for some info but the id permutation for the "new" tensor (i.e., the dense result).
mlir/lib/ExecutionEngine/SparseUtils.cpp
147–148	this sentence does not flow quite well, or am I reading it wrong?
650–651	We really should not need permutation here. If you call toCOO with ID permutation, it restores the indices to original order. Note that toCOO takes the permutation of the target, and internally restores permutation, if there was one from source (the internally stored inverse permutation). So if you call toCOO with ID perm, you don't need the perm here anymore, since indices are in the natural "dense" order MLIR expects.

Also, please add tests for this

(1) in mlir/test/Dialect/SparseTensor conversion, a CHECK test on the expected loop structure (see examples there)
(2) in mlir/test/Integration/Dialect/SparseTensor/CPU, add a new "end-to-end" test that does a dense->sparse->dense roundtrip
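
To illustrate the point above about toCOO restoring the original index order, here is a minimal standalone sketch. It is not the SparseUtils.cpp code: `applyPerm` and `invertPerm` are hypothetical helpers, and the only assumption is that storage keeps indices permuted by `perm` and can undo that with the inverse permutation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: reorders indices so that out[i] = in[perm[i]].
std::vector<uint64_t> applyPerm(const std::vector<uint64_t> &in,
                                const std::vector<uint64_t> &perm) {
  assert(in.size() == perm.size());
  std::vector<uint64_t> out(in.size());
  for (size_t i = 0; i < perm.size(); i++)
    out[i] = in[perm[i]];
  return out;
}

// Hypothetical helper: builds the inverse permutation, inv[perm[i]] = i.
std::vector<uint64_t> invertPerm(const std::vector<uint64_t> &perm) {
  std::vector<uint64_t> inv(perm.size());
  for (size_t i = 0; i < perm.size(); i++)
    inv[perm[i]] = i;
  return inv;
}
```

Applying `perm` and then its inverse returns the indices to their natural order, which is why no further permutation should be needed once the ID permutation is requested for the dense target.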

wrengr mentioned this in rGca010347145d: [mlir][sparse] Factoring out getZero() and avoiding unnecessary Type params.Oct 1 2021, 2:18 PM

wrengr mentioned this in rG14fffda979ae: [mlir][sparse] Factoring out allocaIndices().

wrengr mentioned this in rGaf7ac1d95b7d: [mlir][sparse] Sharing calls to adaptor.getOperands()[0].Oct 1 2021, 2:20 PM

bixia added inline comments.Oct 1 2021, 3:12 PM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
685	this should be encSrc, not encDst, right?
mlir/lib/ExecutionEngine/SparseUtils.cpp
112	nested

wrengr marked 2 inline comments as done.Oct 1 2021, 3:30 PM

wrengr added inline comments.

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
671	Just to be clear, that should only ever happen when the source is a "sparse" tensor that happens to use dense storage for all dimensions, right? Which is to say, the destination is always a bog-standard dense tensor, right? If so then the code should still work; if not then we'll have to figure out how to tighten up the guards for detecting the different cases. I'll update the commentary (on the assumption that the answer to the first question is yes; of course, if that's true, then shouldn't the "sparse=>sparse" case have the same caveat?)
677–679	Yep, that's what I was thinking re "good enough for now" :) I mainly added the comment since when discussing the design with Tatiana, she raised some concerns about the performance implications of the method/function calls. For our current goals, I can't imagine this branch would be taken often enough to constitute a performance bottleneck. (And whenever it does, it's easy enough to fix at that point.)
685	Yeah, I've been mulling over a few ways to clean genNewCall up. Did you want me to do that in/before this differential, or is it okay to do afterwards?
mlir/lib/ExecutionEngine/SparseUtils.cpp
147–148	I'll try rewording
650–651	Oh good. I thought that's how things worked, but I wasn't quite certain.

Addressing comments

Harbormaster completed remote builds in B126751: Diff 376644.Oct 1 2021, 3:32 PM

aartbik added inline comments.Oct 4 2021, 9:14 AM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
671	No, I really mean no encoding at all. There is a subtle difference between an un-annotated tensor and a tensor with all-dense annotations. All conversions work for the all-dense annotated case (it is treated as a sort of sparse tensor). But the logic of falling into a branch based on encDst, !encDst, encSrc, !encSrc (so four truth values) fell into this branch for two cases, but you only implemented the sparse->dense. So you will have to add one if-test and return failure.
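
The four truth values mentioned above can be spelled out in a small standalone sketch. The names here are hypothetical: `hasSrcEnc` and `hasDstEnc` stand in for `encSrc` and `encDst` being non-null, and `looseGuard` is only an illustration of a guard that tests the destination alone, not a quote of the code under review.

```cpp
#include <cassert>

// Hypothetical classification of a conversion by which sides carry an encoding.
enum class ConvKind { SparseToSparse, SparseToDense, DenseToSparse, DenseToDense };

ConvKind classifyConversion(bool hasSrcEnc, bool hasDstEnc) {
  if (hasSrcEnc && hasDstEnc)
    return ConvKind::SparseToSparse;
  if (hasSrcEnc && !hasDstEnc)
    return ConvKind::SparseToDense;
  if (!hasSrcEnc && hasDstEnc)
    return ConvKind::DenseToSparse;
  return ConvKind::DenseToDense; // un-annotated on both sides
}

// A guard that only checks the destination accepts two of the four cases.
bool looseGuard(bool hasSrcEnc, bool hasDstEnc) {
  (void)hasSrcEnc;
  return !hasDstEnc;
}

// Checking both sides accepts sparse=>dense only.
bool tightGuard(bool hasSrcEnc, bool hasDstEnc) {
  return hasSrcEnc && !hasDstEnc;
}
```

Note that an all-dense annotated tensor still counts as having an encoding here, so it lands in one of the "sparse" cases.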

Re-updating commentary about covering dense=>sparse not dense=>dense. Also some preliminary WIP towards adding tests.

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
671	Got it

wrengr marked 2 inline comments as not done.Oct 4 2021, 2:09 PM

Harbormaster completed remote builds in B126929: Diff 377031.Oct 4 2021, 3:06 PM

Debugging some errors

wrengr added inline comments.Oct 4 2021, 3:52 PM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
685	Good catch :) That led to an utterly inscrutable crash stacktrace

Harbormaster completed remote builds in B126937: Diff 377045.Oct 4 2021, 4:17 PM

wrengr updated this revision to Diff 377317.Oct 5 2021, 11:33 AM

(more debugging attempt; not much progress)

Harbormaster completed remote builds in B127125: Diff 377317.Oct 5 2021, 11:47 AM

aartbik added inline comments.Oct 5 2021, 1:40 PM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
606	using the "zero" data value itself has a bit of a risk that a sparse data structure with a few explicitly stored zeros will bail this loop too soon; having a 0/1 result and passing the value as another ref parameter seems safer
632	you cannot just pass "indices" to next call and use it here; you will need the IR to load the contents returned through the memref by the getNext() call, using explicit memref::LoadOp ops on the elements in the memref and passing this to the StoreOp
635	You are replacing the op (which returns a tensor) with the result of an alloc (which is a memref). That type mismatch will fail. You need a buffer cast in between.
635	This replacement will also need some legality check changes. Up to now, we were replacing sparse tensors with opaque pointers, and the checks/rewriting did all the work. But now we have dense_tensor = convert .... return dense_tensor, and the mechanism will need some "love" to make it accept the rewriting even though the types were already legal to start with.
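
The concern on line 606 about explicitly stored zeros can be shown with a standalone sketch. This is an illustration under assumed names (`Element`, `ElementCursor`, and both counting functions are hypothetical), not the support library's iterator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical element type: coordinates plus a stored value.
struct Element {
  std::vector<uint64_t> indices;
  double value;
};

// Hypothetical cursor whose getNext() reports exhaustion through its
// boolean result and hands back data through out-parameters.
class ElementCursor {
public:
  explicit ElementCursor(std::vector<Element> elems)
      : elems(std::move(elems)) {}

  bool getNext(std::vector<uint64_t> &indices, double &value) {
    if (pos == elems.size())
      return false; // no more elements; out-parameters untouched
    indices = elems[pos].indices;
    value = elems[pos].value;
    pos++;
    return true;
  }

private:
  std::vector<Element> elems;
  size_t pos = 0;
};

// Counts the elements visited when the boolean result ends the loop.
size_t countWithFlag(ElementCursor cursor) {
  std::vector<uint64_t> ind;
  double val = 0.0;
  size_t n = 0;
  while (cursor.getNext(ind, val))
    n++;
  return n;
}

// Counts the elements visited when a zero value is (wrongly) treated
// as the end-of-iteration sentinel.
size_t countWithSentinel(const std::vector<Element> &elems) {
  size_t n = 0;
  for (const Element &e : elems) {
    if (e.value == 0.0)
      break; // misreads an explicitly stored zero as "done"
    n++;
  }
  return n;
}
```

With an explicit zero stored in the middle of the element list, the sentinel loop stops early while the flag-based loop visits every element.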

rebased, and factored out D111763 and D111766

wrengr added inline comments.Oct 14 2021, 2:57 PM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
606	Gross; but yes, you're right. I think I'll leave the commentary here as-is, however, since —like the field accesses on line 452, and like the commentary for other branches— it's reflecting more what the C++ code in the ExecutionEngine does, rather than reflecting what the generated MLIR code does; and there's no ambiguity about what the `getNext()` method returns, even though `IMPL_COO_GETNEXT` introduces the problem you mention.

wrengr marked an inline comment as done.Oct 14 2021, 2:57 PM

Harbormaster completed remote builds in B128961: Diff 379851.Oct 14 2021, 2:58 PM

Everything should now be ready for review. I've added integration and codegen tests, fixed various infelicities, and rebased.

N.B., I'm planning to add support for dynamic sizes, but I want to do that in a separate differential

Harbormaster completed remote builds in B131047: Diff 382805.Oct 27 2021, 3:10 PM

aartbik added inline comments.Oct 27 2021, 3:40 PM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
207	returns (to be consistent with Generates); also "as" instead of "at"? (2x)
209	is the "inline" really important (here and below)? internal linkage gives the compiler sufficient room to make that decision using built-in heuristics
358	we typically do not show that much detail of the C++ lib in this file
365	This is not a style we use in MLIR, and especially for an internally linked method, and compared to the surrounding methods, it feels a bit heavy
366	let's not document SparseUtils inside this file
438	not in this revision yet? seems a relatively minor addition (and then we are done for all cases!)
452	same
609–611	remove this; it more or less applies to all method calls here, and it will fall under the general umbrella of perhaps moving to 100% codegen over support lib usage....
615	we don't emit errors when rewriting rules fail to apply
624	all this block scoping makes the code more lengthy than it could be. I would either break this up in methods where it makes sense, or otherwise just take the scoping hit for readability
645	shouldn't we bail out if there are dynamic sizes? or, better yet, just add those in this revision
mlir/lib/ExecutionEngine/SparseUtils.cpp
152–153	this class adds an enormous amount of code for a very thin iterator; how about just having the very simple start()/next() iteration inside the COO class itself? That way, we can also assert an error if you try to insert while iterating
589	assert(action == kEmpty); disappeared in your rewrite
589	this is of course inside a macro now, but LLVM wants braces {} on all branches if one of them is branched
682	if you use a shorter name, perhaps it fits on one line
795–804	I prefer not to have this function at all; but if we keep it, it should go to CRunnerUtils.cpp, since we have some other printing utilities there and this is not related to sparse
mlir/test/Dialect/SparseTensor/conversion_sparse2dense.mlir
16	Ah, you made this much more accurate checking than conversion.mlir does (good). But, let's also add a check in that file in the more informal way there
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2dense.mlir
45	from dense to sparse
64–69	I think you are overthinking this. I agree that the silent exit(1) is not good (so perhaps we should just go back to fully FileChecked cases). But how about just printing -12345 on error and exiting, and then also adding a CHECK-NOT -12345 or something like that? But again, I actually prefer the fully FileChecked cases and just relying on printing and verifying the output
132–133	yeah, we need to do that or asan will fail. Proper way, I think: (1) memref.buffer_cast from tensor to memref, (2) dealloc memref
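
The suggestion on lines 152–153 (iteration inside the COO class, with a guard against inserting while iterating) could look roughly like the following. This is a sketch only: `TinyCOO` is a hypothetical stand-in, and it throws `std::logic_error` where the real code would more likely use an assertion, purely so the behavior is observable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical miniature coordinate-scheme tensor with built-in iteration.
template <typename V>
class TinyCOO {
public:
  struct Element {
    std::vector<uint64_t> indices;
    V value;
  };

  // Adds an element; refuses to do so while an iteration is in progress.
  void add(const std::vector<uint64_t> &ind, V val) {
    if (iteratorLocked)
      throw std::logic_error("add() called while iterating");
    elements.push_back({ind, val});
  }

  // Starts (or restarts) iteration and locks the tensor against insertion.
  void startIterator() {
    iteratorLocked = true;
    iteratorPos = 0;
  }

  // Returns the next element, or nullptr (and unlocks) when exhausted.
  const Element *getNext() {
    if (iteratorLocked && iteratorPos < elements.size())
      return &elements[iteratorPos++];
    iteratorLocked = false;
    return nullptr;
  }

private:
  std::vector<Element> elements;
  bool iteratorLocked = false;
  size_t iteratorPos = 0;
};
```

Keeping the position and lock inside the class avoids a separate iterator object, at the cost of allowing only one iteration at a time.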

In D110790#3091638, @wrengr wrote:

N.B., I'm planning to add support for dynamic sizes, but I want to do that in a separate differential

Ah, I saw this after writing my comments. So you can ignore the part where I asked about that ;-)

Added child differential D112674 for handling dynamic sizes

wrengr added a child revision: D112674: [mlir][sparse] Adding dynamic-size support for sparse=>dense conversion.Oct 27 2021, 4:08 PM

Harbormaster completed remote builds in B131068: Diff 382836.Oct 27 2021, 4:20 PM

Addressing comments

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
438	It's done in D112674.
645	Done in D112674
685	Ha, actually I was right(-ish) the first time. We want the destination encoding here, so we don't apply the dimOrdering permutation twice. The problem was that I was passing a nullptr rather than explicitly constructing the SparseTensorEncodingAttr
mlir/lib/ExecutionEngine/SparseUtils.cpp
795–804	Yeah, I'd rather not have this function too. Unfortunately, when I tried using the CRunnerUtils.cpp functions that vector.print does, I could never get it to link right: it either complained about re-defining a symbol, or about using an un-defined symbol. As for implementing it in the integration test itself, I can't seem to find a way to define string values (not attributes) for passing to fputs(). Moved to CRunnerUtils.cpp for now. Will try to rip it out in a future differential.
mlir/test/Dialect/SparseTensor/conversion_sparse2dense.mlir
16	I'm not sure I follow. I was just trying to do the same as conversion.mlir (but in the opposite direction, naturally); and elsewhere you suggested breaking things out into a separate test rather than reusing the ones already there.

Harbormaster completed remote builds in B131083: Diff 382862.Oct 27 2021, 5:38 PM

aartbik added inline comments.Oct 27 2021, 6:42 PM

mlir/lib/ExecutionEngine/SparseUtils.cpp
795–804	How about just leaving it at the exit(1) for now, and thinking about this in a future differential, rather than introducing something we want to rip out again anyway?
mlir/test/Dialect/SparseTensor/conversion_sparse2dense.mlir
16	Ok, fair enough. I was suggesting to have one sparse->dense check in the conversion.mlir in the more concise style of that test, but having this more rigid tests is indeed sufficient. You can ignore the suggestion ;-)

aartbik added inline comments.Oct 27 2021, 8:42 PM

mlir/lib/ExecutionEngine/CRunnerUtils.cpp
47–58	(On Diff #382862)	Let's not do this at all, but let's go with just using CHECKs. I moved the sparse_conversion test back to this format as well: https://reviews.llvm.org/D112688

Addressing comments

Herald added a subscriber: mgrang.Oct 28 2021, 12:45 PM

Harbormaster completed remote builds in B131271: Diff 383129.Oct 28 2021, 12:54 PM

yeah, looking good. few last nits before we submit...

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
449	Inserts (I usually use the s-form of a verb in the top level comment, but the imperative form in the inlined code)
456–457	is this comment still relevant? I think we can safely remove it?
643	you still have some block scoping left? none of these really release stuff early to keep memory lower so I would opt for readability over block scoping
mlir/lib/ExecutionEngine/SparseUtils.cpp
96	yeah, this is awesome!
121	iteratorLocked = false?
790	feel free to rename this into more intuitive names in a follow up revision too, btw

Addressing nits

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
643	Imo the blocks improve readability rather than detract.
mlir/lib/ExecutionEngine/SparseUtils.cpp
790	Will do

aartbik added inline comments.Oct 28 2021, 1:34 PM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
617–628	how about removing this comment block and simply using

    // Setup a synthetic all-dense, no-permutation encoding for the dense destination.
    encDst = SparseTensorEncodingAttr::get(
        op->getContext(),
        SmallVector<SparseTensorEncodingAttr::DimLevelType>(
            rank, SparseTensorEncodingAttr::DimLevelType::Dense),
        AffineMap(), 0, 0);

we don't need anything copied from src here.

Harbormaster completed remote builds in B131281: Diff 383144.Oct 28 2021, 1:37 PM

wrengr marked an inline comment as done.Oct 28 2021, 1:59 PM

wrengr added inline comments.

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
617–628	We need the (ptrTp,indTp,valTp) values for _mlir_ciface_newSparseTensor to enter the right CASE. But I can abbreviate the commentary

Addressing nits

aartbik added inline comments.Oct 28 2021, 2:09 PM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
617–628	this last one may have crossed. I just feel L601-612 takes too much real estate explaining what could be a one-liner
643	Ok, acceptable ;-)
mlir/lib/ExecutionEngine/SparseUtils.cpp
651	tensor is a bit more consistent name with IMPL3
660	oh, we will leak memory right now! When iteration is done, we should release the COO tensor. The easiest way would be to do this: if (elem == nullptr) { delete iter; return false; } and document that tensor can no longer be used once getNext returns false
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2dense.mlir
80–82	this is taken from the original sparse_conversion.mlir but with your addition it feels like we should (1) remove this comment, or (2) add a comment for tensor4,5,6 as well
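
The leak fix proposed for line 660 (release the object as soon as iteration reports exhaustion) can be sketched as below. Everything here is hypothetical: `Cursor`, `getNextValue`, and the `liveCursors` counter exist only to make the ownership hand-off visible, and the opaque `void *` handle merely imitates how the generated code passes the pointer around.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Counts live Cursor objects so the release can be observed.
int liveCursors = 0;

// Hypothetical heap-allocated iteration state handed out as an opaque pointer.
struct Cursor {
  explicit Cursor(std::vector<double> vals) : values(std::move(vals)) {
    liveCursors++;
  }
  ~Cursor() { liveCursors--; }
  std::vector<double> values;
  size_t pos = 0;
};

// Fills *out and returns true while elements remain. On exhaustion it
// deletes the cursor and returns false; the handle must not be used again.
bool getNextValue(void *handle, double *out) {
  Cursor *cursor = static_cast<Cursor *>(handle);
  if (cursor->pos == cursor->values.size()) {
    delete cursor;
    return false;
  }
  *out = cursor->values[cursor->pos++];
  return true;
}
```

The trade-off is that the caller loses the handle after the first `false` result, which is why the comment asks for that contract to be documented.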

aartbik added inline comments.Oct 28 2021, 2:12 PM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
617–628	Oh yeah, right you are. Okay, then indeed just a bit less comment, but same code ;-)
mlir/lib/ExecutionEngine/SparseUtils.cpp
660	this is the most important issue, since without this fix, asan will break the test

Harbormaster completed remote builds in B131293: Diff 383158.Oct 28 2021, 2:19 PM

wrengr marked 7 inline comments as done.Oct 28 2021, 2:46 PM

Fixing memory leak

Ship it, Wren!

This revision is now accepted and ready to land.Oct 28 2021, 3:02 PM

Harbormaster completed remote builds in B131302: Diff 383170.Oct 28 2021, 3:05 PM

Closed by commit rG28882b6575d2: [mlir][sparse] Implementing sparse=>dense conversion. (authored by wrengr).Oct 28 2021, 3:27 PM

This revision was automatically updated to reflect the committed changes.

wrengr added a commit: rG28882b6575d2: [mlir][sparse] Implementing sparse=>dense conversion..

aartbik mentioned this in D112779: [mlir][sparse] fix broken asan test.Oct 28 2021, 8:43 PM

aartbik mentioned this in rG00040d734960: [mlir][sparse] fix broken asan test.Oct 28 2021, 8:54 PM

wrengr mentioned this in D112854: [mlir][sparse] Improve handling of dynamic-sizes for sparse=>dense conversion.Oct 29 2021, 5:09 PM

Revision Contents

Path	Size
mlir/include/mlir/Dialect/SparseTensor/Transforms/Passes.td	1 line
mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp	169 lines
mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorPasses.cpp	3 lines
mlir/lib/ExecutionEngine/SparseUtils.cpp	116 lines
mlir/test/Dialect/SparseTensor/conversion_sparse2dense.mlir	162 lines
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2dense.mlir	143 lines

Diff 382805

mlir/include/mlir/Dialect/SparseTensor/Transforms/Passes.td

[100 lines not shown]	```mlir
%c1 = arith.constant 1 : index		%c1 = arith.constant 1 : index
%0 = call @sparsePointers(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xindex>		%0 = call @sparsePointers(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xindex>
```		```
}];		}];
let constructor = "mlir::createSparseTensorConversionPass()";		let constructor = "mlir::createSparseTensorConversionPass()";
let dependentDialects = [		let dependentDialects = [
"arith::ArithmeticDialect",		"arith::ArithmeticDialect",
"LLVM::LLVMDialect",		"LLVM::LLVMDialect",
		"linalg::LinalgDialect",
"memref::MemRefDialect",		"memref::MemRefDialect",
"scf::SCFDialect",		"scf::SCFDialect",
"sparse_tensor::SparseTensorDialect",		"sparse_tensor::SparseTensorDialect",
"vector::VectorDialect",		"vector::VectorDialect",
];		];
}		}

#endif // MLIR_DIALECT_SPARSETENSOR_TRANSFORMS_PASSES		#endif // MLIR_DIALECT_SPARSETENSOR_TRANSFORMS_PASSES

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp

[30 lines not shown]

/// New tensor storage action. Keep these values consistent with		/// New tensor storage action. Keep these values consistent with
/// the sparse runtime support library.		/// the sparse runtime support library.
enum Action : uint32_t {		enum Action : uint32_t {
kEmpty = 0,		kEmpty = 0,
kFromFile = 1,		kFromFile = 1,
kFromCOO = 2,		kFromCOO = 2,
kEmptyCOO = 3,		kEmptyCOO = 3,
kToCOO = 4		kToCOO = 4,
		kToIter = 5
};		};

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Helper methods.		// Helper methods.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// Returns internal type encoding for primary storage. Keep these		/// Returns internal type encoding for primary storage. Keep these
/// values consistent with the sparse runtime support library.		/// values consistent with the sparse runtime support library.
[149 lines not shown]	static void sizesFromPtr(ConversionPatternRewriter &rewriter,
auto shape = stp.getShape();		auto shape = stp.getShape();
for (unsigned i = 0, rank = stp.getRank(); i < rank; i++)		for (unsigned i = 0, rank = stp.getRank(); i < rank; i++)
if (shape[i] == ShapedType::kDynamicSize)		if (shape[i] == ShapedType::kDynamicSize)
sizes.push_back(genDimSizeCall(rewriter, op, enc, src, i));		sizes.push_back(genDimSizeCall(rewriter, op, enc, src, i));
else		else
sizes.push_back(constantIndex(rewriter, op->getLoc(), shape[i]));		sizes.push_back(constantIndex(rewriter, op->getLoc(), shape[i]));
}		}

/// Generates a temporary buffer of the given size and type.		/// Generates an uninitialized temporary buffer of the given size and
static Value genAlloca(ConversionPatternRewriter &rewriter, Location loc,		/// type, but return it at type `memref<? x $tp>` (rather than at type
		/// `memref<$sz x $tp>`).
		inline static Value genAlloca(ConversionPatternRewriter &rewriter, Location loc,
unsigned sz, Type tp) {		unsigned sz, Type tp) {
auto memTp = MemRefType::get({ShapedType::kDynamicSize}, tp);		auto memTp = MemRefType::get({ShapedType::kDynamicSize}, tp);
Value a = constantIndex(rewriter, loc, sz);		Value a = constantIndex(rewriter, loc, sz);
return rewriter.create<memref::AllocaOp>(loc, memTp, ValueRange{a});		return rewriter.create<memref::AllocaOp>(loc, memTp, ValueRange{a});
}		}

		/// Generates an uninitialized temporary buffer with room for one value
		/// of the given type, and return the `memref<$tp>`.
		inline static Value genAllocaScalar(ConversionPatternRewriter &rewriter,
		Location loc, Type tp) {
		return rewriter.create<memref::AllocaOp>(loc, MemRefType::get({}, tp));
		}

/// Generates a temporary buffer of the given type and given contents.		/// Generates a temporary buffer of the given type and given contents.
static Value genBuffer(ConversionPatternRewriter &rewriter, Location loc,		static Value genBuffer(ConversionPatternRewriter &rewriter, Location loc,
ArrayRef<Value> values) {		ArrayRef<Value> values) {
unsigned sz = values.size();		unsigned sz = values.size();
assert(sz >= 1);		assert(sz >= 1);
Value buffer = genAlloca(rewriter, loc, sz, values[0].getType());		Value buffer = genAlloca(rewriter, loc, sz, values[0].getType());
for (unsigned i = 0; i < sz; i++) {		for (unsigned i = 0; i < sz; i++) {
Value idx = constantIndex(rewriter, loc, i);		Value idx = constantIndex(rewriter, loc, i);
[119 lines not shown]	static void genAddEltCall(ConversionPatternRewriter &rewriter, Operation *op,
params.push_back(val);		params.push_back(val);
params.push_back(ind);		params.push_back(ind);
params.push_back(perm);		params.push_back(perm);
Type pTp = LLVM::LLVMPointerType::get(rewriter.getI8Type());		Type pTp = LLVM::LLVMPointerType::get(rewriter.getI8Type());
auto fn = getFunc(op, name, pTp, params, /*emitCInterface=*/true);		auto fn = getFunc(op, name, pTp, params, /*emitCInterface=*/true);
rewriter.create<CallOp>(loc, pTp, fn, params);		rewriter.create<CallOp>(loc, pTp, fn, params);
}		}

		/// Generates a call to `SparseTensorCOO<V>::Iterator::getNext()`.
		/// To avoid needing to handle multiple outputs and avoid defining
		/// a bunch of new MLIR types for `Element<V>`, we instead have both
		/// `indices` and `elemPtr` serve as out-parameters and return a bool
		/// to indicate whether those out-parameters are filled or whether we
		/// have no more elements to iterate.
		///
		/// \param [in] iter A value of MLIR-type `!llvm.ptr<i8>` which we
		/// static-cast to C++-type `SparseTensorCOO<V>::Iterator*`.
		/// \param [out] ind A value of `memref<?xindex>` type, where the dynamic
		/// size matches the iterator/sparse-tensor.
		/// \param [out] elemPtr A value of MLIR-type `memref<V>`.
		///
		/// \returns `i1` indicating whether `ind` and `elemPtr` were filled.
		static Value genGetNextCall(ConversionPatternRewriter &rewriter, Operation *op,
		Value iter, Value ind, Value elemPtr) {
		Location loc = op->getLoc();
		Type elemTp = elemPtr.getType().cast<ShapedType>().getElementType();
		StringRef name;
		if (elemTp.isF64())
		name = "getNextF64";
		else if (elemTp.isF32())
		name = "getNextF32";
		else if (elemTp.isInteger(64))
		name = "getNextI64";
		else if (elemTp.isInteger(32))
		name = "getNextI32";
		else if (elemTp.isInteger(16))
		name = "getNextI16";
		else if (elemTp.isInteger(8))
		name = "getNextI8";
		else
		llvm_unreachable("Unknown element type");
		SmallVector<Value, 3> params;
		params.push_back(iter);
		params.push_back(ind);
		params.push_back(elemPtr);
		Type i1 = rewriter.getI1Type();
		auto fn = getFunc(op, name, i1, params, /*emitCInterface=*/true);
		auto call = rewriter.create<CallOp>(loc, i1, fn, params);
		return call.getResult(0);
		}

/// If the tensor is a sparse constant, generates and returns the pair of		/// If the tensor is a sparse constant, generates and returns the pair of
/// the constants for the indices and the values.		/// the constants for the indices and the values.
static Optional<std::pair<Value, Value>>		static Optional<std::pair<Value, Value>>
genSplitSparseConstant(ConversionPatternRewriter &rewriter, Location loc,		genSplitSparseConstant(ConversionPatternRewriter &rewriter, Location loc,
Value tensor) {		Value tensor) {
if (auto constOp = tensor.getDefiningOp<arith::ConstantOp>()) {		if (auto constOp = tensor.getDefiningOp<arith::ConstantOp>()) {
if (auto attr = constOp.getValue().dyn_cast<SparseElementsAttr>()) {		if (auto attr = constOp.getValue().dyn_cast<SparseElementsAttr>()) {
DenseElementsAttr indicesAttr = attr.getIndices();		DenseElementsAttr indicesAttr = attr.getIndices();
[18 lines not shown]	Value val = rewriter.create<tensor::ExtractOp>(loc, indices,
ValueRange{ivs[0], idx});		ValueRange{ivs[0], idx});
val =		val =
rewriter.create<arith::IndexCastOp>(loc, val, rewriter.getIndexType());		rewriter.create<arith::IndexCastOp>(loc, val, rewriter.getIndexType());
rewriter.create<memref::StoreOp>(loc, val, ind, idx);		rewriter.create<memref::StoreOp>(loc, val, ind, idx);
}		}
return rewriter.create<tensor::ExtractOp>(loc, values, ivs[0]);		return rewriter.create<tensor::ExtractOp>(loc, values, ivs[0]);
}		}

		/// Generates code to allocate a tensor of the given type, and zero
		/// initialize it. This function assumes the TensorType is fully
		/// specified (i.e., has static rank and sizes).
		// TODO(wrengr): support dynamic sizes.
		static Value allocDenseTensor(ConversionPatternRewriter &rewriter, Location loc,
		RankedTensorType tensorTp) {
		Type elemTp = tensorTp.getElementType();
		auto memTp = MemRefType::get(tensorTp.getShape(), elemTp);
		Value mem = rewriter.create<memref::AllocOp>(loc, memTp);
		Value zero = constantZero(rewriter, loc, elemTp);
		rewriter.create<linalg::FillOp>(loc, zero, mem).result();
		return mem;
		}

		/// Insert the element returned by genGetNextCall() into the tensor
		/// created by allocDenseTensor().
		///
		/// \param elemPtr The `memref<V>` filled by genGetNextCall().
		/// \param tensor The `memref<... x V>` returned by allocDenseTensor().
		/// \param rank The rank of the `tensor`, and length of `ind`.
		/// \param ind The `memref<?xindex>` filled by genGetNextCall().
		static void insertScalarIntoDenseTensor(ConversionPatternRewriter &rewriter,
		Location loc, Value elemPtr,
		Value tensor, unsigned rank,
		Value ind) {
		// Can't pass `Value ind` directly to memref::LoadOp::build();
		// instead must explicitly convert it into a ValueRange `ivs`.
		SmallVector<Value, 4> ivs;
		ivs.reserve(rank);
		for (unsigned i = 0; i < rank; i++) {
		Value idx = constantIndex(rewriter, loc, i);
		ivs.push_back(rewriter.create<memref::LoadOp>(loc, ind, idx));
		}
		Value elemV = rewriter.create<memref::LoadOp>(loc, elemPtr);
		rewriter.create<memref::StoreOp>(loc, elemV, tensor, ivs);
		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Conversion rules.		// Conversion rules.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// Sparse conversion rule for returns.		/// Sparse conversion rule for returns.
class SparseReturnConverter : public OpConversionPattern<ReturnOp> {		class SparseReturnConverter : public OpConversionPattern<ReturnOp> {
public:		public:
using OpConversionPattern::OpConversionPattern;		using OpConversionPattern::OpConversionPattern;
[114 lines not shown]	if (encDst && encSrc) {
src);		src);
newParams(rewriter, params, op, encDst, kToCOO, sizes, src);		newParams(rewriter, params, op, encDst, kToCOO, sizes, src);
Value coo = genNewCall(rewriter, op, params);		Value coo = genNewCall(rewriter, op, params);
params[6] = constantI32(rewriter, loc, kFromCOO);		params[6] = constantI32(rewriter, loc, kFromCOO);
params[7] = coo;		params[7] = coo;
rewriter.replaceOp(op, genNewCall(rewriter, op, params));		rewriter.replaceOp(op, genNewCall(rewriter, op, params));
return success();		return success();
}		}
if (!encDst || encSrc) {		if (!encDst && encSrc) {
// TODO: sparse => dense		// This is sparse => dense conversion, which is handled as follows:
		// dst = new Tensor(0);
		// iter = src->toCOO()->getIterator();
		// while (elem = iter->getNext()) {
		// dst[elem.indices] = elem.value;
		// }
		// While it would be more efficient to inline the iterator logic
		// directly rather than allocating an object and calling methods,
		// this is good enough for now.
		aartbik: remove this; it more or less applies to all method calls here, and it will fall under the general umbrella of perhaps moving to 100% codegen over support lib usage....
		Location loc = op->getLoc();
		RankedTensorType tensorTp = resType.dyn_cast<RankedTensorType>();
		if (!tensorTp) {
		op.emitError() << "Result type is not a RankedTensorType";
		aartbik: we don't emit errors when rewriting rules fail to apply
		return failure();
		}
		unsigned rank = tensorTp.getRank();
		Type elemTp = tensorTp.getElementType();
		Value dst = allocDenseTensor(rewriter, loc, tensorTp);
		Value ind = genAlloca(rewriter, loc, rank, rewriter.getIndexType());
		Value elemPtr = genAllocaScalar(rewriter, loc, elemTp);
		Value iter;
		{
		aartbik: all this block scoping makes the code more lengthy than it could be. I would either break this up into methods where it makes sense, or otherwise just take the scoping hit for readability
		// Clone encSrc but removing the dimOrdering.
		// The srcDimOrdering will already be applied during the
		// conversion from `SparseTensorStorage src` to SparseTensorCOO
		// (before that COO is converted to an iterator); so we don't
		aartbik: how about removing this comment block and simply using
		    // Setup a synthetic all-dense, no-permutation encoding for the dense destination.
		    encDst = SparseTensorEncodingAttr::get(
		        op->getContext(),
		        SmallVector<SparseTensorEncodingAttr::DimLevelType>(
		            rank, SparseTensorEncodingAttr::DimLevelType::Dense),
		        AffineMap(), 0, 0);
		we don't need anything copied from src here.
		aartbik: this last one may have crossed. I just feel L601-612 takes too much real estate explaining what could be a one-liner
		wrengr (author): We need the (ptrTp,indTp,valTp) values for _mlir_ciface_newSparseTensor to enter the right CASE. But I can abbreviate the commentary
		aartbik: Oh yeah, right you are. Okay, then indeed just a bit less comment, but same code ;-)
		// want newParams() to apply it a second time.
		//
		// The dimLevelType is only actually used by the actions which
		// return SparseTensorStorage (namely: kEmpty, kFromFile, and
		aartbik: you cannot just pass "indices" to next call and use it here; you will need the IR to load the contents returned through the memref by the getNext() call, using explicit memref::LoadOp ops on the elements in the memref and passing this to the StoreOp
		// kFromCOO); so since we are using kToIter, the only operational
		// requirement is that it has the right length. Since the dst
		// is a dense tensor, we choose to set dimLevelType to all-dense
		aartbik: You are replacing the op (which returns a tensor) with the result of an alloc (which is a memref). That type mismatch will fail. You need a buffer cast in between.
		aartbik: This replacement will also need some legality check changes. Up to now, we were replacing sparse tensors with opaque pointers, and the checks/rewriting did all the work. But now we have "dense_tensor = convert ....; return dense_tensor", and the mechanism will need some "love" to make it accept the rewriting even though the types were already legal to start with
		// for semantic correctness.
		encDst = SparseTensorEncodingAttr::get(
		op->getContext(),
		SmallVector<SparseTensorEncodingAttr::DimLevelType>(
		rank, SparseTensorEncodingAttr::DimLevelType::Dense),
		AffineMap(), encSrc.getPointerBitWidth(),
		encSrc.getIndexBitWidth());
		SmallVector<Value, 4> sizes;
		aartbik: you still have some block scoping left? none of these really release stuff early to keep memory lower so I would opt for readability over block scoping
		wrengr (author): Imo the blocks improve readability rather than detract.
		aartbik: Ok, acceptable ;-)
		SmallVector<Value, 8> params;
		// TODO(wrengr): support dynamic sizes.
		aartbik: shouldn't we bail out if there are dynamic sizes? or, better yet, just add those in this revision
		wrengr (author): Done in D112674
		sizesFromType(rewriter, sizes, loc, tensorTp);
		newParams(rewriter, params, op, encDst, kToIter, sizes, src);
		iter = genNewCall(rewriter, op, params);
		}
		SmallVector<Value> noArgs;
		SmallVector<Type> noTypes;
		auto whileOp = rewriter.create<scf::WhileOp>(loc, noTypes, noArgs);
		{
		Block *before = rewriter.createBlock(&whileOp.before(), {}, noTypes);
		rewriter.setInsertionPointToEnd(before);
		Value cond = genGetNextCall(rewriter, op, iter, ind, elemPtr);
		rewriter.create<scf::ConditionOp>(loc, cond, before->getArguments());
		}
		{
		Block *after = rewriter.createBlock(&whileOp.after(), {}, noTypes);
		rewriter.setInsertionPointToStart(after);
		insertScalarIntoDenseTensor(rewriter, loc, elemPtr, dst, rank, ind);
		rewriter.create<scf::YieldOp>(loc);
		}
		rewriter.setInsertionPointAfter(whileOp);
		rewriter.replaceOpWithNewOp<memref::TensorLoadOp>(op, resType, dst);
		return success();
		}
		if (!encDst && !encSrc) {
		// dense => dense
return failure();		return failure();
		aartbik: Technically, we also fall into this branch for the strange dense to dense conversion. We will have typically folded those away, but I would not completely rely on this having taken place always and defend against that in the code.
		wrengr (author): Just to be clear, that should only ever happen when the source is a "sparse" tensor that happens to use dense storage for all dimensions, right? Which is to say, the destination is always a bog-standard dense tensor, right? If so then the code should still work; if not then we'll have to figure out how to tighten up the guards for detecting the different cases. I'll update the commentary (on the assumption that the answer to the first question is yes; of course, if that's true, then shouldn't the "sparse=>sparse" case have the same caveat?)
		aartbik: No, I really mean no encoding at all. There is a subtle difference between an un-annotated tensor and a tensor with all-dense annotations. All conversions work for the all-dense annotated case (it is treated as a sort of sparse tensor). But the logic on falling into a branch based on encDst, !encDst, encSrc, !encSrc (so four truth values) fell into this branch for two cases, but you only implemented the sparse->dense. So you will have to add one if-test and return failure.
		wrengr (author): Got it
}		}
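As a reading aid for the thread above: a minimal sketch, not the code in this revision, of the four-way case split on whether the source and destination carry a sparse encoding. `ConvertKind` and `classifyConversion` are hypothetical names, and the two booleans stand in for `encSrc` and `encDst` being non-null. Per the review, an all-dense annotated tensor still counts as having an encoding; only a tensor with no annotation at all is "dense" here.

```cpp
#include <cassert>

// Hypothetical labels for the four combinations discussed in the review.
enum class ConvertKind {
  SparseToSparse,
  SparseToDense,
  DenseToSparse,
  DenseToDense
};

// hasSrcEnc / hasDstEnc model "encSrc / encDst is non-null" (assumption).
ConvertKind classifyConversion(bool hasSrcEnc, bool hasDstEnc) {
  if (hasSrcEnc && hasDstEnc)
    return ConvertKind::SparseToSparse;
  if (hasSrcEnc && !hasDstEnc)
    return ConvertKind::SparseToDense;
  if (!hasSrcEnc && hasDstEnc)
    return ConvertKind::DenseToSparse;
  // Neither side is annotated: per the review, the pattern should bail out
  // with failure() here rather than run the sparse => dense path.
  return ConvertKind::DenseToDense;
}
```

A test on the destination encoding alone would send both the sparse => dense and the dense => dense cases down the same branch, which is the situation the reviewer asked to guard against.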
// This is a dense => sparse conversion or a sparse constant in COO =>		// This is a dense => sparse conversion or a sparse constant in COO =>
// sparse conversion, which is handled as follows:		// sparse conversion, which is handled as follows:
// t = newSparseCOO()		// t = newSparseCOO()
// ...code to fill the COO tensor t...		// ...code to fill the COO tensor t...
// s = newSparseTensor(t)		// s = newSparseTensor(t)
//		//
// To fill the COO tensor from a dense tensor:		// To fill the COO tensor from a dense tensor:
		aartbik: Note that in the context of another project, we may migrate library code to actual codegen (which has the advantage of a smaller memory footprint potentially and allows for "unforeseen" type combinations). Such a migration may take care of all such performance concerns.
		wrengr (author): Yep, that's what I was thinking re "good enough for now" :) I mainly added the comment since when discussing the design with Tatiana, she raised some concerns about the performance implications of the method/function calls. For our current goals, I can't imagine this branch would be taken often enough to constitute a performance bottleneck. (And whenever it does, it's easy enough to fix at that point.)
// for i1 in dim1		// for i1 in dim1
// ..		// ..
// for ik in dimk		// for ik in dimk
// val = a[i1,..,ik]		// val = a[i1,..,ik]
// if val != 0		// if val != 0
// t->add(val, [i1,..,ik], [p1,..,pk])		// t->add(val, [i1,..,ik], [p1,..,pk])
		aartbik: I think we need a subtle rewriting of the new call utility since we need the sparse encoding for some info, but the id permutation for the "new" tensor (i.e., the dense result)
		wrengr (author): Yeah, I've been mulling over a few ways to clean genNewCall up. Did you want me to do that in/before this differential, or is it okay to do afterwards?
		bixia: this should be encSrc, not encDst, right?
		wrengr (author): Good catch :) That led to an utterly inscrutable crash stacktrace
		wrengr (author): Ha, actually I was right(-ish) the first time. We want the destination encoding here, so we don't apply the dimOrdering permutation twice. The problem was that I was passing a nullptr rather than explicitly constructing the SparseTensorEncodingAttr
//		//
// To fill the COO tensor from a sparse constant in COO format:		// To fill the COO tensor from a sparse constant in COO format:
// for i in range(NNZ)		// for i in range(NNZ)
// val = values[i]		// val = values[i]
// [i1,..,ik] = indices[i]		// [i1,..,ik] = indices[i]
// t->add(val, [i1,..,ik], [p1,..,pk])		// t->add(val, [i1,..,ik], [p1,..,pk])
//		//
// Note that the dense tensor traversal code is actually implemented		// Note that the dense tensor traversal code is actually implemented
▲ Show 20 Lines • Show All 212 Lines • Show Last 20 Lines

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorPasses.cpp

Show First 20 Lines • Show All 112 Lines • ▼ Show 20 Lines	target.addDynamicallyLegalOp<tensor::DimOp>([&](tensor::DimOp op) {
return converter.isLegal(op.getOperandTypes());		return converter.isLegal(op.getOperandTypes());
});		});
target.addDynamicallyLegalOp<tensor::CastOp>([&](tensor::CastOp op) {		target.addDynamicallyLegalOp<tensor::CastOp>([&](tensor::CastOp op) {
return converter.isLegal(op.getOperand().getType());		return converter.isLegal(op.getOperand().getType());
});		});
// The following operations and dialects may be introduced by the		// The following operations and dialects may be introduced by the
// rewriting rules, and are therefore marked as legal.		// rewriting rules, and are therefore marked as legal.
target.addLegalOp<arith::CmpFOp, arith::CmpIOp, arith::ConstantOp,		target.addLegalOp<arith::CmpFOp, arith::CmpIOp, arith::ConstantOp,
arith::IndexCastOp, tensor::ExtractOp>();		arith::IndexCastOp, linalg::FillOp, linalg::YieldOp,
		tensor::ExtractOp>();
target.addLegalDialect<LLVM::LLVMDialect, memref::MemRefDialect,		target.addLegalDialect<LLVM::LLVMDialect, memref::MemRefDialect,
scf::SCFDialect>();		scf::SCFDialect>();
// Populate with rules and apply rewriting rules.		// Populate with rules and apply rewriting rules.
populateFuncOpTypeConversionPattern(patterns, converter);		populateFuncOpTypeConversionPattern(patterns, converter);
populateCallOpTypeConversionPattern(patterns, converter);		populateCallOpTypeConversionPattern(patterns, converter);
populateSparseTensorConversionPatterns(converter, patterns);		populateSparseTensorConversionPatterns(converter, patterns);
if (failed(applyPartialConversion(getOperation(), target,		if (failed(applyPartialConversion(getOperation(), target,
std::move(patterns))))		std::move(patterns))))
Show All 13 Lines

mlir/lib/ExecutionEngine/SparseUtils.cpp

Show First 20 Lines • Show All 87 Lines • ▼ Show 20 Lines
public:		public:
SparseTensorCOO(const std::vector<uint64_t> &szs, uint64_t capacity)		SparseTensorCOO(const std::vector<uint64_t> &szs, uint64_t capacity)
: sizes(szs) {		: sizes(szs) {
if (capacity)		if (capacity)
elements.reserve(capacity);		elements.reserve(capacity);
}		}
/// Adds element as indices and value.		/// Adds element as indices and value.
void add(const std::vector<uint64_t> &ind, V val) {		void add(const std::vector<uint64_t> &ind, V val) {
uint64_t rank = getRank();		uint64_t rank = getRank();
		aartbik: yeah, this is awesome!
assert(rank == ind.size());		assert(rank == ind.size());
for (uint64_t r = 0; r < rank; r++)		for (uint64_t r = 0; r < rank; r++)
assert(ind[r] < sizes[r]); // within bounds		assert(ind[r] < sizes[r]); // within bounds
elements.emplace_back(ind, val);		elements.emplace_back(ind, val);
}		}
/// Sorts elements lexicographically by index.		/// Sorts elements lexicographically by index.
void sort() { std::sort(elements.begin(), elements.end(), lexOrder); }		void sort() { std::sort(elements.begin(), elements.end(), lexOrder); }
/// Returns rank.		/// Returns rank.
uint64_t getRank() const { return sizes.size(); }		uint64_t getRank() const { return sizes.size(); }
/// Getter for sizes array.		/// Getter for sizes array.
const std::vector<uint64_t> &getSizes() const { return sizes; }		const std::vector<uint64_t> &getSizes() const { return sizes; }
/// Getter for elements array.		/// Getter for elements array.
const std::vector<Element<V>> &getElements() const { return elements; }		const std::vector<Element<V>> &getElements() const { return elements; }

		// Forward declaration of the class required by getIterator. We make
		// it a nested class so that it can access the private fields.
		bixia: nested
		class Iterator;
		/// Returns an iterator over the elements of a SparseTensorCOO.
		Iterator *getIterator() const { return new Iterator(*this); }

/// Factory method. Permutes the original dimensions according to		/// Factory method. Permutes the original dimensions according to
/// the given ordering and expects subsequent add() calls to honor		/// the given ordering and expects subsequent add() calls to honor
/// that same ordering for the given indices. The result is a		/// that same ordering for the given indices. The result is a
/// fully permuted coordinate scheme.		/// fully permuted coordinate scheme.
static SparseTensorCOO<V> *newSparseTensorCOO(uint64_t rank,		static SparseTensorCOO<V> *newSparseTensorCOO(uint64_t rank,
		aartbik: iteratorLocked = false?
const uint64_t *sizes,		const uint64_t *sizes,
const uint64_t *perm,		const uint64_t *perm,
uint64_t capacity = 0) {		uint64_t capacity = 0) {
std::vector<uint64_t> permsz(rank);		std::vector<uint64_t> permsz(rank);
for (uint64_t r = 0; r < rank; r++)		for (uint64_t r = 0; r < rank; r++)
permsz[perm[r]] = sizes[r];		permsz[perm[r]] = sizes[r];
return new SparseTensorCOO<V>(permsz, capacity);		return new SparseTensorCOO<V>(permsz, capacity);
}		}
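To make the size permutation in `newSparseTensorCOO` concrete, here is a small standalone sketch of the same `permsz[perm[r]] = sizes[r]` loop. `permuteSizes` is a hypothetical helper name and the example sizes are made up for illustration; only the loop body is taken from the code above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper mirroring the loop in newSparseTensorCOO:
// the size of original dimension r lands at position perm[r].
std::vector<uint64_t> permuteSizes(const std::vector<uint64_t> &sizes,
                                   const std::vector<uint64_t> &perm) {
  assert(sizes.size() == perm.size());
  std::vector<uint64_t> permsz(sizes.size());
  for (uint64_t r = 0; r < sizes.size(); r++)
    permsz[perm[r]] = sizes[r];
  return permsz;
}
```

With sizes {10, 20, 30} and perm {2, 0, 1} this gives {20, 30, 10}; with the identity permutation {0, 1, 2} the sizes come back unchanged.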

private:		private:
/// Returns true if indices of e1 < indices of e2.		/// Returns true if indices of e1 < indices of e2.
static bool lexOrder(const Element<V> &e1, const Element<V> &e2) {		static bool lexOrder(const Element<V> &e1, const Element<V> &e2) {
uint64_t rank = e1.indices.size();		uint64_t rank = e1.indices.size();
assert(rank == e2.indices.size());		assert(rank == e2.indices.size());
for (uint64_t r = 0; r < rank; r++) {		for (uint64_t r = 0; r < rank; r++) {
if (e1.indices[r] == e2.indices[r])		if (e1.indices[r] == e2.indices[r])
continue;		continue;
return e1.indices[r] < e2.indices[r];		return e1.indices[r] < e2.indices[r];
}		}
return false;		return false;
}		}
std::vector<uint64_t> sizes; // per-dimension sizes		const std::vector<uint64_t> sizes; // per-dimension sizes
std::vector<Element<V>> elements;		std::vector<Element<V>> elements;
};		};

		/// An iterator over the elements of a sparse tensor in coordinate-scheme
		/// format. This iterator is not designed for use by C++ code itself,
		aartbik: this sentence does not flow quite well, or am I reading it wrong?
		wrengr (author): I'll try rewording
		/// but rather by generated MLIR code (i.e., by calls to `IMPL_COO_GETNEXT`);
		/// hence why it may look idiosyncratic or unconventional compared to
		/// conventional C++ iterators.
		template <typename V>
		class SparseTensorCOO<V>::Iterator {
		aartbik: this class adds an enormous amount of code for a very thin iterator. How about just having the very simple start()/next() iteration inside the COO class itself? That way, we can also assert an error if you try to insert while iterating
		// TODO(wrengr): really this class should be a thin wrapper/subclass
		// of the std::vector, rather than needing to do a dereference every
		// time a method is called; but we don't want to actually copy the whole
		// contents of the underlying array(s) when this class is initialized.
		// Maybe we should be a thin wrapper/subclass of SparseTensorCOO?
		// Or have a variant of SparseTensorStorage::toCOO() to construct this
		// iterator directly?
		const std::vector<Element<V>> &elements;
		unsigned pos;

		public:
		// TODO(wrengr): to guarantee safety we'd either need to consume the
		// SparseTensorCOO (e.g., requiring an rvalue-reference) or get notified
		// somehow whenever the SparseTensorCOO adds new elements, sorts, etc.
		Iterator(const SparseTensorCOO<V> &coo) : elements(coo.elements), pos(0) {}

		const Element<V> *getNext() {
		if (pos < elements.size())
		return &(elements[pos++]);
		return nullptr;
		}
		};
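One way to read the reviewer's suggestion of folding a simple start()/next() iteration into the COO class itself is sketched below. This is an illustration under assumptions, not the code that landed: `CooSketch`, `ElementSketch`, `startIterator`, `iteratorLocked`, `iteratorPos`, and `countElements` are all hypothetical names, and the class keeps only the parts needed to show the locking idea.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-in for Element<V> (assumption: indices plus one value).
template <typename V>
struct ElementSketch {
  std::vector<uint64_t> indices;
  V value;
};

template <typename V>
class CooSketch {
public:
  void add(const std::vector<uint64_t> &ind, V val) {
    // Inserting while an iteration is in flight would invalidate it.
    assert(!iteratorLocked && "add() called while iterating");
    elements.push_back(ElementSketch<V>{ind, val});
  }
  void startIterator() {
    iteratorLocked = true;
    iteratorPos = 0;
  }
  // Returns the next element, or nullptr (and unlocks) when exhausted.
  const ElementSketch<V> *getNext() {
    assert(iteratorLocked && "getNext() called before startIterator()");
    if (iteratorPos < elements.size())
      return &elements[iteratorPos++];
    iteratorLocked = false;
    return nullptr;
  }

private:
  std::vector<ElementSketch<V>> elements;
  bool iteratorLocked = false;
  uint64_t iteratorPos = 0;
};

// Walks the whole tensor once and counts the stored elements.
template <typename V>
uint64_t countElements(CooSketch<V> &coo) {
  uint64_t n = 0;
  coo.startIterator();
  while (coo.getNext() != nullptr)
    n++;
  return n;
}
```

The design trade-off in this sketch is that the tensor allows only one iteration at a time, in exchange for no separate iterator object and a cheap check against inserting mid-iteration.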

/// Abstract base class of sparse tensor storage. Note that we use		/// Abstract base class of sparse tensor storage. Note that we use
/// function overloading to implement "partial" method specialization.		/// function overloading to implement "partial" method specialization.
class SparseTensorStorageBase {		class SparseTensorStorageBase {
public:		public:
enum DimLevelType : uint8_t { kDense = 0, kCompressed = 1, kSingleton = 2 };		enum DimLevelType : uint8_t { kDense = 0, kCompressed = 1, kSingleton = 2 };

virtual uint64_t getDimSize(uint64_t) = 0;		virtual uint64_t getDimSize(uint64_t) = 0;

▲ Show 20 Lines • Show All 385 Lines • ▼ Show 20 Lines	enum PrimaryTypeEnum : uint32_t {
kI8 = 6		kI8 = 6
};		};

enum Action : uint32_t {		enum Action : uint32_t {
kEmpty = 0,		kEmpty = 0,
kFromFile = 1,		kFromFile = 1,
kFromCOO = 2,		kFromCOO = 2,
kEmptyCOO = 3,		kEmptyCOO = 3,
kToCOO = 4		kToCOO = 4,
		kToIter = 5
};		};

#define CASE(p, i, v, P, I, V) \		#define CASE(p, i, v, P, I, V) \
if (ptrTp == (p) && indTp == (i) && valTp == (v)) { \		if (ptrTp == (p) && indTp == (i) && valTp == (v)) { \
SparseTensorCOO<V> *tensor = nullptr; \		SparseTensorCOO<V> *tensor = nullptr; \
if (action == kFromFile) \		if (action <= kFromCOO) { \
tensor = \		if (action == kFromFile) { \
openSparseTensorCOO<V>(static_cast<char *>(ptr), rank, sizes, perm); \		char *filename = static_cast<char *>(ptr); \
else if (action == kFromCOO) \		tensor = openSparseTensorCOO<V>(filename, rank, sizes, perm); \
		} else if (action == kFromCOO) \
		aartbik: assert(action == kEmpty); disappeared in your rewrite
		aartbik: this is of course inside a macro now, but LLVM wants braces {} on all branches if one of them is branched
tensor = static_cast<SparseTensorCOO<V> *>(ptr); \		tensor = static_cast<SparseTensorCOO<V> *>(ptr); \
else if (action == kEmptyCOO) \
return SparseTensorCOO<V>::newSparseTensorCOO(rank, sizes, perm); \
else if (action == kToCOO) \
return static_cast<SparseTensorStorage<P, I, V> *>(ptr)->toCOO(perm); \
else \
assert(action == kEmpty); \
return SparseTensorStorage<P, I, V>::newSparseTensor(rank, sizes, perm, \		return SparseTensorStorage<P, I, V>::newSparseTensor(rank, sizes, perm, \
sparsity, tensor); \		sparsity, tensor); \
		} else if (action == kEmptyCOO) \
		return SparseTensorCOO<V>::newSparseTensorCOO(rank, sizes, perm); \
		else { \
		tensor = static_cast<SparseTensorStorage<P, I, V> *>(ptr)->toCOO(perm); \
		if (action == kToCOO) \
		return tensor; \
		else { \
		assert(action == kToIter); \
		return tensor->getIterator(); \
		} \
		} \
}		}

#define IMPL1(NAME, TYPE, LIB) \		#define IMPL1(NAME, TYPE, LIB) \
void _mlir_ciface_##NAME(StridedMemRefType<TYPE, 1> *ref, void *tensor) { \		void _mlir_ciface_##NAME(StridedMemRefType<TYPE, 1> *ref, void *tensor) { \
assert(ref); \		assert(ref); \
assert(tensor); \		assert(tensor); \
std::vector<TYPE> *v; \		std::vector<TYPE> *v; \
static_cast<SparseTensorStorageBase *>(tensor)->LIB(&v); \		static_cast<SparseTensorStorageBase *>(tensor)->LIB(&v); \
Show All 30 Lines	void *_mlir_ciface_##NAME(void *tensor, TYPE value, \
uint64_t isize = iref->sizes[0]; \		uint64_t isize = iref->sizes[0]; \
std::vector<index_t> indices(isize); \		std::vector<index_t> indices(isize); \
for (uint64_t r = 0; r < isize; r++) \		for (uint64_t r = 0; r < isize; r++) \
indices[perm[r]] = indx[r]; \		indices[perm[r]] = indx[r]; \
static_cast<SparseTensorCOO<TYPE> *>(tensor)->add(indices, value); \		static_cast<SparseTensorCOO<TYPE> *>(tensor)->add(indices, value); \
return tensor; \		return tensor; \
}		}

		/// Calls SparseTensorCOO<V>::Iterator::getNext() with the following semantics.
		/// To avoid needing to handle multiple outputs and avoid defining
		aartbik: We really should not need permutation here. If you call toCOO with ID permutation, it restores the indices to original order. Note that toCOO takes the permutation of the target, and internally restores permutation, if there was one from source (the internally stored inverse permutation). So if you call toCOO with ID perm, you don't need the perm here anymore, since indices are in the natural "dense" order MLIR expects.
		wrengr (author): Oh good. I thought that's how things worked, but I wasn't quite certain.
		aartbik: tensor is a bit more consistent name with IMPL3
		/// a bunch of new MLIR types for `Element<V>`, we instead have both
		/// `iref` and `value` serve as out-parameters and return a bool to
		/// indicate whether those out-parameters are filled or whether we have
		/// no more elements to iterate.
		#define IMPL_COO_GETNEXT(NAME, V) \
		bool _mlir_ciface_##NAME(void *ptr, StridedMemRefType<uint64_t, 1> *iref, \
		StridedMemRefType<V, 0> *vref) { \
		assert(iref->strides[0] == 1); \
		uint64_t *indx = iref->data + iref->offset; \
		aartbik: oh, we will leak memory right now! When iteration is done, we should release the COO tensor. The easiest way would be to do this
		    if (elem == nullptr) { delete iter; return false; }
		and document that tensor can no longer be used once getNext returns false
		aartbik: this is the most important issue, since without this fix, asan will break the test
		V *value = vref->data + vref->offset; \
		const uint64_t isize = iref->sizes[0]; \
		auto iter = static_cast<SparseTensorCOO<V>::Iterator *>(ptr); \
		const Element<V> *elem = iter->getNext(); \
		if (elem == nullptr) \
		return false; \
		for (uint64_t r = 0; r < isize; r++) \
		indx[r] = elem->indices[r]; \
		*value = elem->value; \
		return true; \
		}
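The two review points on this macro (return a bool with out-parameters instead of a sentinel value, and free the iterator once it is exhausted) can be sketched together in plain C++. This is a simplified illustration, not the macro above: raw pointers replace `StridedMemRefType`, and `ElemF64`, `IterF64`, `getNextF64Sketch`, and `toDenseSketch` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

struct ElemF64 {
  std::vector<uint64_t> indices;
  double value;
};

// Hypothetical heap-allocated iterator state, handed around as void *.
struct IterF64 {
  std::vector<ElemF64> elements;
  std::size_t pos = 0;
};

// Returns true and fills indx/value; on exhaustion frees the iterator and
// returns false, after which ptr must not be used again.
bool getNextF64Sketch(void *ptr, uint64_t *indx, uint64_t rank,
                      double *value) {
  auto *iter = static_cast<IterF64 *>(ptr);
  if (iter->pos >= iter->elements.size()) {
    delete iter;
    return false;
  }
  const ElemF64 &elem = iter->elements[iter->pos++];
  for (uint64_t r = 0; r < rank; r++)
    indx[r] = elem.indices[r];
  *value = elem.value;
  return true;
}

// Scatters rank-2 elements into a row-major rows x cols dense buffer,
// roughly the shape of the while-loop the conversion generates.
std::vector<double> toDenseSketch(std::vector<ElemF64> elems, uint64_t rows,
                                  uint64_t cols) {
  auto *state = new IterF64();
  state->elements = std::move(elems);
  void *iter = state;
  std::vector<double> dst(rows * cols, 0.0);
  uint64_t ind[2];
  double val;
  while (getNextF64Sketch(iter, ind, 2, &val))
    dst[ind[0] * cols + ind[1]] = val;
  return dst;
}
```

Because the loop condition is the returned bool rather than the value itself, an explicitly stored zero does not end the iteration early, and the single `delete` on the final call keeps the sketch leak-free.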

/// Constructs a new sparse tensor. This is the "swiss army knife"		/// Constructs a new sparse tensor. This is the "swiss army knife"
/// method for materializing sparse tensors into the computation.		/// method for materializing sparse tensors into the computation.
///		///
/// action:		/// action:
/// kEmpty = returns empty storage to fill later		/// kEmpty = returns empty storage to fill later
/// kFromFile = returns storage, where ptr contains filename to read		/// kFromFile = returns storage, where ptr contains filename to read
/// kFromCOO = returns storage, where ptr contains coordinate scheme to assign		/// kFromCOO = returns storage, where ptr contains coordinate scheme to assign
/// kEmptyCOO = returns empty coordinate scheme to fill and use with kFromCOO		/// kEmptyCOO = returns empty coordinate scheme to fill and use with kFromCOO
/// kToCOO = returns coordinate scheme from storage in ptr to use with kFromCOO		/// kToCOO = returns coordinate scheme from storage in ptr to use with kFromCOO
		/// kToIter = returns iterator from storage in ptr (call IMPL_COO_GETNEXT to
		aartbik: if you use a shorter name, perhaps it fits on one line
		/// use)
void *		void *
_mlir_ciface_newSparseTensor(StridedMemRefType<uint8_t, 1> *aref, // NOLINT		_mlir_ciface_newSparseTensor(StridedMemRefType<uint8_t, 1> *aref, // NOLINT
StridedMemRefType<index_t, 1> *sref,		StridedMemRefType<index_t, 1> *sref,
StridedMemRefType<index_t, 1> *pref,		StridedMemRefType<index_t, 1> *pref,
uint32_t ptrTp, uint32_t indTp, uint32_t valTp,		uint32_t ptrTp, uint32_t indTp, uint32_t valTp,
uint32_t action, void *ptr) {		uint32_t action, void *ptr) {
assert(aref && sref && pref);		assert(aref && sref && pref);
assert(aref->strides[0] == 1 && sref->strides[0] == 1 &&		assert(aref->strides[0] == 1 && sref->strides[0] == 1 &&
▲ Show 20 Lines • Show All 81 Lines • ▼ Show 20 Lines
/// Helper to add value to coordinate scheme, one per value type.		/// Helper to add value to coordinate scheme, one per value type.
IMPL3(addEltF64, double)		IMPL3(addEltF64, double)
IMPL3(addEltF32, float)		IMPL3(addEltF32, float)
IMPL3(addEltI64, int64_t)		IMPL3(addEltI64, int64_t)
IMPL3(addEltI32, int32_t)		IMPL3(addEltI32, int32_t)
IMPL3(addEltI16, int16_t)		IMPL3(addEltI16, int16_t)
IMPL3(addEltI8, int8_t)		IMPL3(addEltI8, int8_t)

		/// Helper to enumerate elements of coordinate scheme, one per value type.
		IMPL_COO_GETNEXT(getNextF64, double)
		IMPL_COO_GETNEXT(getNextF32, float)
		IMPL_COO_GETNEXT(getNextI64, int64_t)
		IMPL_COO_GETNEXT(getNextI32, int32_t)
		IMPL_COO_GETNEXT(getNextI16, int16_t)
		IMPL_COO_GETNEXT(getNextI8, int8_t)

#undef CASE		#undef CASE
#undef IMPL1		#undef IMPL1
		aartbik: feel free to rename this into more intuitive names in a follow up revision too, btw
		wrengr (author): Will do
#undef IMPL2		#undef IMPL2
#undef IMPL3		#undef IMPL3
		#undef IMPL_COO_GETNEXT

		// TODO(wrengr): Either make this function more robust/usable, or figure
		// out how to avoid it.
		bool printInqualityF64(int64_t c, int64_t i, int64_t j, int64_t k,
		double expected, double found) {
		if (expected == found)
		return false;
		fprintf(stdout, "%c[%ld,%ld,%ld] Expected: %lg; but found: %lg\n", (int)c, i,
		j, k, expected, found);
		return true;
		}
		aartbik: I prefer not to have this function at all; but if we keep it, it should go to CRunnerUtils.cpp since we have some other printing utilities there but this is not related to sparse
		wrengr (author): Yeah, I'd rather not have this function too. Unfortunately, when I tried using the CRunnerUtils.cpp functions that vector.print does, I could never get it to link right: it either complained about re-defining a symbol, or about using an un-defined symbol. As for implementing it in the integration test itself, I can't seem to find a way to define string values (not attributes) for passing to fputs(). Moved to CRunnerUtils.cpp for now. Will try to rip it out in a future differential.
		aartbik: How about just leaving it at the exit(1) for now and thinking about this in a future differential, rather than introducing something we want to rip out again anyway?

		//===----------------------------------------------------------------------===//
		//
		// Public API with methods that accept C-style data structures to interact
		// with sparse tensors, which are only visible as opaque pointers externally.
		// These methods can be used both by MLIR compiler-generated code as well as by
		// an external runtime that wants to interact with MLIR compiler-generated code.
		//

mlir/test/Dialect/SparseTensor/conversion_sparse2dense.mlir

This file was added.

				// RUN: mlir-opt %s --sparse-tensor-conversion --canonicalize --cse | FileCheck %s

				#SparseVector = #sparse_tensor.encoding<{
				dimLevelType = ["compressed"]
				}>

				#SparseMatrix = #sparse_tensor.encoding<{
				dimLevelType = ["dense", "compressed"]
				}>

				#SparseTensor = #sparse_tensor.encoding<{
				dimLevelType = ["dense", "compressed", "compressed"],
				dimOrdering = affine_map<(i,j,k) -> (k,i,j)>
				}>

				// CHECK-LABEL: func @sparse_convert_1d(
				aartbik: Ah, you made this much more accurate checking than conversion.mlir does (good). But let's also add a check in that file, in the more informal style used there.
				wrengr (author): I'm not sure I follow. I was just trying to do the same as conversion.mlir (but in the opposite direction, naturally); and elsewhere you suggested breaking things out into a separate test rather than reusing the ones already there.
				aartbik: Ok, fair enough. I was suggesting to have one sparse->dense check in conversion.mlir in the more concise style of that test, but having these more rigid tests is indeed sufficient. You can ignore the suggestion ;-)
				// CHECK-SAME: %[[Arg:.*]]: !llvm.ptr<i8>) -> tensor<13xi32>
				// CHECK-DAG: %[[I0:.*]] = arith.constant 0 : index
				// CHECK-DAG: %[[I13:.*]] = arith.constant 13 : index
				//
				// CHECK-DAG: %[[M:.*]] = memref.alloc() : memref<13xi32>
				// CHECK-DAG: %[[E0:.*]] = arith.constant 0 : i32
				// CHECK-DAG: linalg.fill(%[[E0]], %[[M]]) : i32, memref<13xi32>
				// CHECK-DAG: %[[IndS:.*]] = memref.alloca() : memref<1xindex>
				// CHECK-DAG: %[[IndD:.*]] = memref.cast %[[IndS]] : memref<1xindex> to memref<?xindex>
				// CHECK-DAG: %[[ElemBuffer:.*]] = memref.alloca() : memref<i32>
				//
				// CHECK-DAG: %[[AttrsS:.*]] = memref.alloca() : memref<1xi8>
				// CHECK-DAG: %[[AttrsD:.*]] = memref.cast %[[AttrsS]] : memref<1xi8> to memref<?xi8>
				// CHECK-DAG: %[[Attr0:.*]] = arith.constant 0 : i8
				// CHECK-DAG: memref.store %[[Attr0]], %[[AttrsS]][%[[I0]]] : memref<1xi8>
				//
				// CHECK-DAG: %[[SizesS:.*]] = memref.alloca() : memref<1xindex>
				// CHECK-DAG: %[[SizesD:.*]] = memref.cast %[[SizesS]] : memref<1xindex> to memref<?xindex>
				// CHECK-DAG: memref.store %[[I13]], %[[SizesS]][%[[I0]]] : memref<1xindex>
				//
				// CHECK-DAG: %[[PermS:.*]] = memref.alloca() : memref<1xindex>
				// CHECK-DAG: %[[PermD:.*]] = memref.cast %[[PermS]] : memref<1xindex> to memref<?xindex>
				// CHECK-DAG: memref.store %[[I0]], %[[PermS]][%[[I0]]] : memref<1xindex>
				//
				// CHECK-DAG: %[[SecTp:.*]] = arith.constant 1 : i32
				// CHECK-DAG: %[[ElemTp:.*]] = arith.constant 4 : i32
				// CHECK-DAG: %[[ActionToIter:.*]] = arith.constant 5 : i32
				// CHECK: %[[Iter:.*]] = call @newSparseTensor(%[[AttrsD]], %[[SizesD]], %[[PermD]], %[[SecTp]], %[[SecTp]], %[[ElemTp]], %[[ActionToIter]], %[[Arg]]) : (memref<?xi8>, memref<?xindex>, memref<?xindex>, i32, i32, i32, i32, !llvm.ptr<i8>) -> !llvm.ptr<i8>
				// CHECK: scf.while : () -> () {
				// CHECK: %[[Cond:.*]] = call @getNextI32(%[[Iter]], %[[IndD]], %[[ElemBuffer]]) : (!llvm.ptr<i8>, memref<?xindex>, memref<i32>) -> i1
				// CHECK: scf.condition(%[[Cond]])
				// CHECK: } do {
				// CHECK: %[[Iv0:.*]] = memref.load %[[IndS]][%[[I0]]] : memref<1xindex>
				// CHECK: %[[ElemVal:.*]] = memref.load %[[ElemBuffer]][] : memref<i32>
				// CHECK: memref.store %[[ElemVal]], %[[M]][%[[Iv0]]] : memref<13xi32>
				// CHECK: scf.yield
				// CHECK: }
				// CHECK: %[[T:.*]] = memref.tensor_load %[[M]] : memref<13xi32>
				// CHECK: return %[[T]] : tensor<13xi32>
				func @sparse_convert_1d(%arg0: tensor<13xi32, #SparseVector>) -> tensor<13xi32> {
				%0 = sparse_tensor.convert %arg0 : tensor<13xi32, #SparseVector> to tensor<13xi32>
				return %0 : tensor<13xi32>
				}

				// CHECK-LABEL: func @sparse_convert_2d(
				// CHECK-SAME: %[[Arg:.*]]: !llvm.ptr<i8>) -> tensor<2x4xf64>
				// CHECK-DAG: %[[I0:.*]] = arith.constant 0 : index
				// CHECK-DAG: %[[I1:.*]] = arith.constant 1 : index
				// CHECK-DAG: %[[I2:.*]] = arith.constant 2 : index
				// CHECK-DAG: %[[I4:.*]] = arith.constant 4 : index
				//
				// CHECK-DAG: %[[M:.*]] = memref.alloc() : memref<2x4xf64>
				// CHECK-DAG: %[[E0:.*]] = arith.constant 0.000000e+00 : f64
				// CHECK-DAG: linalg.fill(%[[E0]], %[[M]]) : f64, memref<2x4xf64>
				// CHECK-DAG: %[[IndS:.*]] = memref.alloca() : memref<2xindex>
				// CHECK-DAG: %[[IndD:.*]] = memref.cast %[[IndS]] : memref<2xindex> to memref<?xindex>
				// CHECK-DAG: %[[ElemBuffer:.*]] = memref.alloca() : memref<f64>
				//
				// CHECK-DAG: %[[AttrsS:.*]] = memref.alloca() : memref<2xi8>
				// CHECK-DAG: %[[AttrsD:.*]] = memref.cast %[[AttrsS]] : memref<2xi8> to memref<?xi8>
				// CHECK-DAG: %[[Attr0:.*]] = arith.constant 0 : i8
				// CHECK-DAG: memref.store %[[Attr0]], %[[AttrsS]][%[[I0]]] : memref<2xi8>
				// CHECK-DAG: memref.store %[[Attr0]], %[[AttrsS]][%[[I1]]] : memref<2xi8>
				//
				// CHECK-DAG: %[[SizesS:.*]] = memref.alloca() : memref<2xindex>
				// CHECK-DAG: %[[SizesD:.*]] = memref.cast %[[SizesS]] : memref<2xindex> to memref<?xindex>
				// CHECK-DAG: memref.store %[[I2]], %[[SizesS]][%[[I0]]] : memref<2xindex>
				// CHECK-DAG: memref.store %[[I4]], %[[SizesS]][%[[I1]]] : memref<2xindex>
				//
				// CHECK-DAG: %[[PermS:.*]] = memref.alloca() : memref<2xindex>
				// CHECK-DAG: %[[PermD:.*]] = memref.cast %[[PermS]] : memref<2xindex> to memref<?xindex>
				// CHECK-DAG: memref.store %[[I0]], %[[PermS]][%[[I0]]] : memref<2xindex>
				// CHECK-DAG: memref.store %[[I1]], %[[PermS]][%[[I1]]] : memref<2xindex>
				//
				// CHECK-DAG: %[[ActionToIter:.*]] = arith.constant 5 : i32
				// CHECK: %[[Iter:.*]] = call @newSparseTensor(%[[AttrsD]], %[[SizesD]], %[[PermD]], %{{.*}}, %{{.*}}, %{{.*}}, %[[ActionToIter]], %[[Arg]]) : (memref<?xi8>, memref<?xindex>, memref<?xindex>, i32, i32, i32, i32, !llvm.ptr<i8>) -> !llvm.ptr<i8>
				// CHECK: scf.while : () -> () {
				// CHECK: %[[Cond:.*]] = call @getNextF64(%[[Iter]], %[[IndD]], %[[ElemBuffer]]) : (!llvm.ptr<i8>, memref<?xindex>, memref<f64>) -> i1
				// CHECK: scf.condition(%[[Cond]])
				// CHECK: } do {
				// CHECK: %[[Iv0:.*]] = memref.load %[[IndS]][%[[I0]]] : memref<2xindex>
				// CHECK: %[[Iv1:.*]] = memref.load %[[IndS]][%[[I1]]] : memref<2xindex>
				// CHECK: %[[ElemVal:.*]] = memref.load %[[ElemBuffer]][] : memref<f64>
				// CHECK: memref.store %[[ElemVal]], %[[M]][%[[Iv0]], %[[Iv1]]] : memref<2x4xf64>
				// CHECK: scf.yield
				// CHECK: }
				// CHECK: %[[T:.*]] = memref.tensor_load %[[M]] : memref<2x4xf64>
				// CHECK: return %[[T]] : tensor<2x4xf64>
				func @sparse_convert_2d(%arg0: tensor<2x4xf64, #SparseMatrix>) -> tensor<2x4xf64> {
				%0 = sparse_tensor.convert %arg0 : tensor<2x4xf64, #SparseMatrix> to tensor<2x4xf64>
				return %0 : tensor<2x4xf64>
				}

				// CHECK-LABEL: func @sparse_convert_3d(
				// CHECK-SAME: %[[Arg:.*]]: !llvm.ptr<i8>) -> tensor<2x3x4xf64>
				// CHECK-DAG: %[[I0:.*]] = arith.constant 0 : index
				// CHECK-DAG: %[[I1:.*]] = arith.constant 1 : index
				// CHECK-DAG: %[[I2:.*]] = arith.constant 2 : index
				// CHECK-DAG: %[[I3:.*]] = arith.constant 3 : index
				// CHECK-DAG: %[[I4:.*]] = arith.constant 4 : index
				//
				// CHECK-DAG: %[[M:.*]] = memref.alloc() : memref<2x3x4xf64>
				// CHECK-DAG: %[[E0:.*]] = arith.constant 0.000000e+00 : f64
				// CHECK-DAG: linalg.fill(%[[E0]], %[[M]]) : f64, memref<2x3x4xf64>
				// CHECK-DAG: %[[IndS:.*]] = memref.alloca() : memref<3xindex>
				// CHECK-DAG: %[[IndD:.*]] = memref.cast %[[IndS]] : memref<3xindex> to memref<?xindex>
				// CHECK-DAG: %[[ElemBuffer:.*]] = memref.alloca() : memref<f64>
				//
				// CHECK-DAG: %[[AttrsS:.*]] = memref.alloca() : memref<3xi8>
				// CHECK-DAG: %[[AttrsD:.*]] = memref.cast %[[AttrsS]] : memref<3xi8> to memref<?xi8>
				// CHECK-DAG: %[[Attr0:.*]] = arith.constant 0 : i8
				// CHECK-DAG: memref.store %[[Attr0]], %[[AttrsS]][%[[I0]]] : memref<3xi8>
				// CHECK-DAG: memref.store %[[Attr0]], %[[AttrsS]][%[[I1]]] : memref<3xi8>
				// CHECK-DAG: memref.store %[[Attr0]], %[[AttrsS]][%[[I2]]] : memref<3xi8>
				//
				// CHECK-DAG: %[[SizesS:.*]] = memref.alloca() : memref<3xindex>
				// CHECK-DAG: %[[SizesD:.*]] = memref.cast %[[SizesS]] : memref<3xindex> to memref<?xindex>
				// CHECK-DAG: memref.store %[[I2]], %[[SizesS]][%[[I0]]] : memref<3xindex>
				// CHECK-DAG: memref.store %[[I3]], %[[SizesS]][%[[I1]]] : memref<3xindex>
				// CHECK-DAG: memref.store %[[I4]], %[[SizesS]][%[[I2]]] : memref<3xindex>
				//
				// CHECK-DAG: %[[PermS:.*]] = memref.alloca() : memref<3xindex>
				// CHECK-DAG: %[[PermD:.*]] = memref.cast %[[PermS]] : memref<3xindex> to memref<?xindex>
				// CHECK-DAG: memref.store %[[I0]], %[[PermS]][%[[I0]]] : memref<3xindex>
				// CHECK-DAG: memref.store %[[I1]], %[[PermS]][%[[I1]]] : memref<3xindex>
				// CHECK-DAG: memref.store %[[I2]], %[[PermS]][%[[I2]]] : memref<3xindex>
				//
				// CHECK-DAG: %[[ActionToIter:.*]] = arith.constant 5 : i32
				// CHECK: %[[Iter:.*]] = call @newSparseTensor(%[[AttrsD]], %[[SizesD]], %[[PermD]], %{{.*}}, %{{.*}}, %{{.*}}, %[[ActionToIter]], %[[Arg]]) : (memref<?xi8>, memref<?xindex>, memref<?xindex>, i32, i32, i32, i32, !llvm.ptr<i8>) -> !llvm.ptr<i8>
				// CHECK: scf.while : () -> () {
				// CHECK: %[[Cond:.*]] = call @getNextF64(%[[Iter]], %[[IndD]], %[[ElemBuffer]]) : (!llvm.ptr<i8>, memref<?xindex>, memref<f64>) -> i1
				// CHECK: scf.condition(%[[Cond]])
				// CHECK: } do {
				// CHECK: %[[Iv0:.*]] = memref.load %[[IndS]][%[[I0]]] : memref<3xindex>
				// CHECK: %[[Iv1:.*]] = memref.load %[[IndS]][%[[I1]]] : memref<3xindex>
				// CHECK: %[[Iv2:.*]] = memref.load %[[IndS]][%[[I2]]] : memref<3xindex>
				// CHECK: %[[ElemVal:.*]] = memref.load %[[ElemBuffer]][] : memref<f64>
				// CHECK: memref.store %[[ElemVal]], %[[M]][%[[Iv0]], %[[Iv1]], %[[Iv2]]] : memref<2x3x4xf64>
				// CHECK: scf.yield
				// CHECK: }
				// CHECK: %[[T:.*]] = memref.tensor_load %[[M]] : memref<2x3x4xf64>
				// CHECK: return %[[T]] : tensor<2x3x4xf64>
				func @sparse_convert_3d(%arg0: tensor<2x3x4xf64, #SparseTensor>) -> tensor<2x3x4xf64> {
				%0 = sparse_tensor.convert %arg0 : tensor<2x3x4xf64, #SparseTensor> to tensor<2x3x4xf64>
				return %0 : tensor<2x3x4xf64>
				}
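
The CHECK lines above describe one lowering shape for all three ranks: allocate and zero-fill a dense buffer, obtain an iterator over the stored elements, then loop on a `getNext`-style call and store each value at its indices. A minimal C++ sketch of that loop follows. `CooElement`, `CooIterator`, and `sparseToDense` are hypothetical names made up for illustration and are not the SparseUtils.cpp API; the sketch also assumes the iterator already yields indices in the dense (unpermuted) order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for one stored element (indices plus value).
struct CooElement {
  std::vector<uint64_t> indices;
  double value;
};

// Hypothetical iterator; plays the role that the opaque pointer returned by
// newSparseTensor plays in the CHECK lines.
class CooIterator {
public:
  explicit CooIterator(std::vector<CooElement> elems)
      : elements(std::move(elems)) {}

  // Copies the next element into the caller's buffers and returns false
  // once all elements have been consumed (compare getNextF64 above).
  bool getNext(std::vector<uint64_t> &indices, double &value) {
    if (pos >= elements.size())
      return false;
    indices = elements[pos].indices;
    value = elements[pos].value;
    ++pos;
    return true;
  }

private:
  std::vector<CooElement> elements;
  std::size_t pos = 0;
};

// Zero-fills a row-major dense buffer and scatters every stored element.
std::vector<double> sparseToDense(CooIterator &iter,
                                  const std::vector<uint64_t> &sizes) {
  uint64_t total = 1;
  for (uint64_t s : sizes)
    total *= s;
  std::vector<double> dense(total, 0.0); // corresponds to linalg.fill
  std::vector<uint64_t> ind(sizes.size());
  double val = 0.0;
  while (iter.getNext(ind, val)) { // corresponds to the scf.while condition
    assert(ind.size() == sizes.size() && "rank mismatch");
    uint64_t offset = 0;
    for (std::size_t d = 0; d < sizes.size(); ++d)
      offset = offset * sizes[d] + ind[d]; // row-major linearization
    dense[offset] = val; // corresponds to memref.store in the loop body
  }
  return dense;
}
```

Only explicitly stored elements are written, so the zero fill is what makes the implicit zeros of the sparse tensor appear in the result.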

mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2dense.mlir

This file was added.

				// RUN: mlir-opt %s \
				// RUN: -sparsification -sparse-tensor-conversion \
				// RUN: -linalg-bufferize -convert-linalg-to-loops \
				// RUN: -convert-vector-to-scf -convert-scf-to-std \
				// RUN: -func-bufferize -tensor-constant-bufferize -tensor-bufferize \
				// RUN: -std-bufferize -finalizing-bufferize \
				// RUN: -convert-vector-to-llvm -convert-memref-to-llvm -convert-std-to-llvm \
				// RUN: -reconcile-unrealized-casts \
				// RUN: | \
				// RUN: mlir-cpu-runner \
				// RUN: -e entry -entry-point-result=void \
				// RUN: -shared-libs=%mlir_integration_test_dir/libmlir_c_runner_utils%shlibext

				#Tensor1 = #sparse_tensor.encoding<{
				dimLevelType = [ "compressed", "compressed", "compressed" ],
				dimOrdering = affine_map<(i,j,k) -> (i,j,k)>
				}>

				#Tensor2 = #sparse_tensor.encoding<{
				dimLevelType = [ "compressed", "compressed", "compressed" ],
				dimOrdering = affine_map<(i,j,k) -> (j,k,i)>
				}>

				#Tensor3 = #sparse_tensor.encoding<{
				dimLevelType = [ "compressed", "compressed", "compressed" ],
				dimOrdering = affine_map<(i,j,k) -> (k,i,j)>
				}>

				#Tensor4 = #sparse_tensor.encoding<{
				dimLevelType = [ "dense", "compressed", "compressed" ],
				dimOrdering = affine_map<(i,j,k) -> (i,j,k)>
				}>

				#Tensor5 = #sparse_tensor.encoding<{
				dimLevelType = [ "dense", "compressed", "compressed" ],
				dimOrdering = affine_map<(i,j,k) -> (j,k,i)>
				}>

				#Tensor6 = #sparse_tensor.encoding<{
				dimLevelType = [ "dense", "compressed", "compressed" ],
				dimOrdering = affine_map<(i,j,k) -> (k,i,j)>
				}>
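
The `dimOrdering` maps above determine the order in which dimensions are stored. As a rough illustration only, the sketch below reorders a size or index tuple into storage order and back. It assumes the convention that `perm[d]` names the original dimension stored at level `d`, so `(i,j,k) -> (j,k,i)` becomes `{1, 2, 0}`; that convention and the function names `toStorageOrder` and `fromStorageOrder` are assumptions for this sketch, and the runtime library may represent the permutation differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reorders a tuple from original dimension order into storage order.
// Assumed convention: perm[d] is the original dimension stored at level d.
std::vector<uint64_t> toStorageOrder(const std::vector<uint64_t> &v,
                                     const std::vector<uint64_t> &perm) {
  assert(v.size() == perm.size() && "rank mismatch");
  std::vector<uint64_t> out(v.size());
  for (std::size_t d = 0; d < perm.size(); ++d)
    out[d] = v[perm[d]];
  return out;
}

// Inverse of toStorageOrder: restores the original dimension order.
std::vector<uint64_t> fromStorageOrder(const std::vector<uint64_t> &stored,
                                       const std::vector<uint64_t> &perm) {
  assert(stored.size() == perm.size() && "rank mismatch");
  std::vector<uint64_t> out(stored.size());
  for (std::size_t d = 0; d < perm.size(); ++d)
    out[perm[d]] = stored[d];
  return out;
}
```

Under this convention a 2x3x4 tensor with ordering `(j,k,i)` has storage sizes 3x4x2, and with `(k,i,j)` it has 4x2x3. Applying the inverse to each stored index tuple recovers indices in the natural dense order.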

				//
				// Integration test that tests conversions between sparse tensors.
				aartbik: from dense to sparse
				//
				module {
				//
				// Verification utilities.
				//
				func private @exit(index) -> ()
				func private @printInqualityF64(index, index, index, index, f64, f64) -> i1
				func @checkTensor(%name: index, %arg0: tensor<2x3x4xf64>, %arg1: tensor<2x3x4xf64>) {
				%c0 = arith.constant 0 : index
				%c1 = arith.constant 1 : index
				%c2 = arith.constant 2 : index
				%c3 = arith.constant 3 : index
				%c4 = arith.constant 4 : index
				scf.for %i = %c0 to %c2 step %c1 {
				scf.for %j = %c0 to %c3 step %c1 {
				scf.for %k = %c0 to %c4 step %c1 {
				%a = tensor.extract %arg0[%i, %j, %k] : tensor<2x3x4xf64>
				%b = tensor.extract %arg1[%i, %j, %k] : tensor<2x3x4xf64>
				// TODO(wrengr): figure out a better way to print the
				// diagnostics on failure.
				%c = call @printInqualityF64(%name, %i, %j, %k, %a, %b) : (index, index, index, index, f64, f64) -> i1
				scf.if %c {
				call @exit(%c1) : (index) -> ()
				}
				aartbik: I think you are overthinking this. I agree that the silent exit(1) is not good (so perhaps we should just go back to fully FileChecked cases). But how about just printing -12345 on error and exiting, and then also adding a CHECK-NOT: -12345 or something like that? But again, I actually prefer the fully FileChecked cases and just relying on printing and verifying the output.
				}
				}
				}
				return
				}

				//
				// Main driver.
				//
				func @entry() {
				//
				// Initialize a 3-dim dense tensor.
				//
				aartbik: This is taken from the original sparse_conversion.mlir, but with your addition it feels like we should (1) remove this comment, or (2) add a comment for tensor4,5,6 as well.
				%t = arith.constant dense<[
				[ [ 1.0, 2.0, 3.0, 4.0 ],
				[ 5.0, 6.0, 7.0, 8.0 ],
				[ 9.0, 10.0, 11.0, 12.0 ] ],
				[ [ 13.0, 14.0, 15.0, 16.0 ],
				[ 17.0, 18.0, 19.0, 20.0 ],
				[ 21.0, 22.0, 23.0, 24.0 ] ]
				]> : tensor<2x3x4xf64>

				//
				// Convert dense tensor directly to various sparse tensors.
				// tensor1: stored as 2x3x4
				// tensor2: stored as 3x4x2
				// tensor3: stored as 4x2x3
				//
				%1 = sparse_tensor.convert %t : tensor<2x3x4xf64> to tensor<2x3x4xf64, #Tensor1>
				%2 = sparse_tensor.convert %t : tensor<2x3x4xf64> to tensor<2x3x4xf64, #Tensor2>
				%3 = sparse_tensor.convert %t : tensor<2x3x4xf64> to tensor<2x3x4xf64, #Tensor3>
				%4 = sparse_tensor.convert %t : tensor<2x3x4xf64> to tensor<2x3x4xf64, #Tensor4>
				%5 = sparse_tensor.convert %t : tensor<2x3x4xf64> to tensor<2x3x4xf64, #Tensor5>
				%6 = sparse_tensor.convert %t : tensor<2x3x4xf64> to tensor<2x3x4xf64, #Tensor6>

				//
				// Convert sparse tensor back to dense.
				//
				%a = sparse_tensor.convert %1 : tensor<2x3x4xf64, #Tensor1> to tensor<2x3x4xf64>
				%b = sparse_tensor.convert %2 : tensor<2x3x4xf64, #Tensor2> to tensor<2x3x4xf64>
				%c = sparse_tensor.convert %3 : tensor<2x3x4xf64, #Tensor3> to tensor<2x3x4xf64>
				%d = sparse_tensor.convert %4 : tensor<2x3x4xf64, #Tensor4> to tensor<2x3x4xf64>
				%e = sparse_tensor.convert %5 : tensor<2x3x4xf64, #Tensor5> to tensor<2x3x4xf64>
				%f = sparse_tensor.convert %6 : tensor<2x3x4xf64, #Tensor6> to tensor<2x3x4xf64>

				//
				// Check round-trip equality.
				//
				%nameA = arith.constant 97 : index
				%nameB = arith.constant 98 : index
				%nameC = arith.constant 99 : index
				%nameD = arith.constant 100 : index
				%nameE = arith.constant 101 : index
				%nameF = arith.constant 102 : index
				call @checkTensor(%nameA, %t, %a) : (index, tensor<2x3x4xf64>, tensor<2x3x4xf64>) -> ()
				call @checkTensor(%nameB, %t, %b) : (index, tensor<2x3x4xf64>, tensor<2x3x4xf64>) -> ()
				call @checkTensor(%nameC, %t, %c) : (index, tensor<2x3x4xf64>, tensor<2x3x4xf64>) -> ()
				call @checkTensor(%nameD, %t, %d) : (index, tensor<2x3x4xf64>, tensor<2x3x4xf64>) -> ()
				call @checkTensor(%nameE, %t, %e) : (index, tensor<2x3x4xf64>, tensor<2x3x4xf64>) -> ()
				call @checkTensor(%nameF, %t, %f) : (index, tensor<2x3x4xf64>, tensor<2x3x4xf64>) -> ()

				// Release the resources.
				// TODO(wrengr): what's the proper way to release a dense tensor?
				// We can't just memref.dealloc since it's not a memref anymore...
				aartbik: Yeah, we need to do that or asan will fail. The proper way, I think: (1) memref.buffer_cast from tensor to memref, (2) dealloc the memref.
				sparse_tensor.release %1 : tensor<2x3x4xf64, #Tensor1>
				sparse_tensor.release %2 : tensor<2x3x4xf64, #Tensor2>
				sparse_tensor.release %3 : tensor<2x3x4xf64, #Tensor3>
				sparse_tensor.release %4 : tensor<2x3x4xf64, #Tensor4>
				sparse_tensor.release %5 : tensor<2x3x4xf64, #Tensor5>
				sparse_tensor.release %6 : tensor<2x3x4xf64, #Tensor6>

				return
				}
				}
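
The reviewer's alternative to `printInqualityF64` is to print a sentinel such as -12345 when the round trip fails and to guard it with a `CHECK-NOT: -12345` line in the test. The C++ sketch below only illustrates that idea; in the actual test the comparison would be written in MLIR, and `checkRoundTrip` and `kMismatchSentinel` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <vector>

// Sentinel value taken from the review comment; FileCheck would reject any
// output containing it via a CHECK-NOT line.
constexpr int kMismatchSentinel = -12345;

// Compares two flattened dense tensors element by element. On the first
// mismatch it writes the sentinel to `os` and returns false.
bool checkRoundTrip(const std::vector<double> &expected,
                    const std::vector<double> &found, std::ostream &os) {
  assert(expected.size() == found.size() && "shapes must match");
  for (std::size_t i = 0; i < expected.size(); ++i) {
    if (expected[i] != found[i]) {
      os << kMismatchSentinel << "\n";
      return false;
    }
  }
  return true;
}
```

A caller would exit with a nonzero status when the function returns false, so a failure is both visible in the output and reflected in the exit code.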

[mlir][sparse] Implementing sparse=>dense conversion. (Closed, Public)

Revision Contents

Diff 382805

mlir/include/mlir/Dialect/SparseTensor/Transforms/Passes.td

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorPasses.cpp

mlir/lib/ExecutionEngine/SparseUtils.cpp

mlir/test/Dialect/SparseTensor/conversion_sparse2dense.mlir

mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2dense.mlir
