This is an archive of the discontinued LLVM Phabricator instance.

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
438–471	Technically, we also fall into this branch for the strange dense to dense conversion. We will have typically folded those away, but I would not completely rely on this having taken place always and defend against that in the code.
444–446	Note that in the context of another project, we may migrate library code to actual codegen (which has the advantage of a smaller memory footprint potentially and allows for "unforeseen" type combinations). Such a migration may take care of all such performance concerns.
452	I think we need a subtle rewriting of the new call utility since we need the sparse encoding for some info, but the id permutation for the "new" tensor (ie. the dense result)
mlir/lib/ExecutionEngine/SparseUtils.cpp
145–146	this sentence does not flow quite well, or am I reading it wrong?
603–604	We really should not need permutation here. If you call toCOO with ID permutation, it restores the indices to original order. Note that toCOO takes the permutation of the target, and internally restores permutation, if there was one from source (the internally stored inverse permutation). So if you call toCOO with ID perm, you don't need the perm here anymore, since indices are in the natural "dense" order MLIR expects.

Also, please add tests for this

(1) in mlir/test/Dialect/SparseTensor conversion, a CHECK test on the expected loop structure (see examples there)
(2) in mlir/test/Integration/Dialect/SparseTensor/CPU, add a new "end-to-end" test that does a dense->sparse->dense roundtrip

wrengr mentioned this in rGca010347145d: [mlir][sparse] Factoring out getZero() and avoiding unnecessary Type params.Oct 1 2021, 2:18 PM

wrengr mentioned this in rG14fffda979ae: [mlir][sparse] Factoring out allocaIndices().

wrengr mentioned this in rGaf7ac1d95b7d: [mlir][sparse] Sharing calls to adaptor.getOperands()[0].Oct 1 2021, 2:20 PM

bixia added inline comments.Oct 1 2021, 3:12 PM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
452	this should be encSrc, not encDst, right?
mlir/lib/ExecutionEngine/SparseUtils.cpp
111	nested

wrengr marked 2 inline comments as done.Oct 1 2021, 3:30 PM

wrengr added inline comments.

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
438–471	Just to be clear, that should only ever happen when the source is a "sparse" tensor that happens to use dense storage for all dimensions, right? Which is to say, the destination is always a bog-standard dense tensor, right? If so then the code should still work; if not then we'll have to figure out how to tighten up the guards for detecting the different cases. I'll update the commentary (on the assumption that the answer to the first question is yes; of course, if that's true, then shouldn't the "sparse=>sparse" case have the same caveat?)
444–446	Yep, that's what I was thinking re "good enough for now" :) I mainly added the comment since when discussing the design with Tatiana, she raised some concerns about the performance implications of the method/function calls. For our current goals, I can't imagine this branch would be taken often enough to constitute a performance bottleneck. (And whenever it does, it's easy enough to fix at that point.)
452	Yeah, I've been mulling over a few ways to clean genNewCall up. Did you want me to do that in/before this differential, or is it okay to do afterwards?
mlir/lib/ExecutionEngine/SparseUtils.cpp
145–146	I'll try rewording
603–604	Oh good. I thought that's how things worked, but I wasn't quite certain.

Addressing comments

Harbormaster completed remote builds in B126751: Diff 376644.Oct 1 2021, 3:32 PM

aartbik added inline comments.Oct 4 2021, 9:14 AM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
438–471	No, I really means no encoding at all. There is a subtle difference between an un-annotated tensor and a tensor with all-dense annotations. All conversions work for the all-dense annotated case (it is treated as a sort of sparse tensors). But the logic on falling into a branch based on encDst, !encDest, srcDest, !srcDest (so four truth values) fell into this branch for two cases, but you only implemented the sparse->dense. So you will have to add one if-test and return failure.

Re-updating commentary about covering dense=>sparse not dense=>dense. Also some preliminary WIP towards adding tests.

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
438–471	Got it

wrengr marked 2 inline comments as not done.Oct 4 2021, 2:09 PM

Harbormaster completed remote builds in B126929: Diff 377031.Oct 4 2021, 3:06 PM

Debugging some errors

wrengr added inline comments.Oct 4 2021, 3:52 PM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
452	Good catch :) That lead to an utterly inscrutable crash stacktrace

Harbormaster completed remote builds in B126937: Diff 377045.Oct 4 2021, 4:17 PM

wrengr updated this revision to Diff 377317.Oct 5 2021, 11:33 AM

(more debugging attempt; not much progress)

Harbormaster completed remote builds in B127125: Diff 377317.Oct 5 2021, 11:47 AM

aartbik added inline comments.Oct 5 2021, 1:40 PM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
441	using the "zero" data value itself has a bit of a risk that a sparse data structure with a few explicitly stored zeros will bail this loop too soon; having a 0/1 result and passing the value as another ref parameter seems safer
467	you cannot just pass "indices" to next call and use it here; you will need the IR to load the contents returned through the memref by the getNext() call, using explicit memref::LoadOp ops on the elements in the memref and passing this to the StoreOp
470	You are replacing the op (which returns a tensor) with the result of an alloc (which is a memref). That type mismatch will fail. You need a buffer cast in between.
470	This replacement will also need some legality check changes. Up to now, we were replacing sparse tensors with opague pointers, and the checks/rewriting did all the work But now we have dense_tensor = convert .... return dense_tenor and the mechanism will need some "love" to make it accept the rewriting even though the types were already legal to start with

rebased, and factored out D111763 and D111766

wrengr added inline comments.Oct 14 2021, 2:57 PM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
441	Gross; but yes, you're right. I think I'll leave the commentary here as-is, however, since —like the field accesses on line 452, and like the commentary for other branches— it's reflecting more what the C++ code in the ExecutionEngine does, rather than reflecting what the generated MLIR code does; and there's no ambiguity about what the `getNext()` method returns, even though `IMPL_COO_GETNEXT` introduces the problem you mention.

wrengr marked an inline comment as done.Oct 14 2021, 2:57 PM

Harbormaster completed remote builds in B128961: Diff 379851.Oct 14 2021, 2:58 PM

Everything should now be ready for review. I've added integration and codegen tests, fixed various infelicities, and rebased.

N.B., I'm planning to add support for dynamic sizes, but I want to do that in a separate differential

Harbormaster completed remote builds in B131047: Diff 382805.Oct 27 2021, 3:10 PM

aartbik added inline comments.Oct 27 2021, 3:40 PM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
128	returns (to be consistent with Generates) also "as" instead of "at"? (2x)
130	is the "inline" really important (here and below) internal linkage gives the compiler sufficient room to make that decision using built-in heuristics?
262	we typically do not show that much detail of the C++ lib in this file
269	This is not a style we use in MLIR, and especially for an internally linked method, and compared to the surrounding methods, it feels a bit heavy
270	let's not document SparseUtils inside this file
300	not in this revision yet? seems a relatively minor addition (and then we are done for all cases!)
314	same
444–446	remove this; it more or less applies to all method calls here, and it will fall under the general umbrella of perhaps moving to 100% codegen over support lib usage....
450	we don't emit errors when rewriting rules fail to apply
459	all this block scoping makes the code more lengthy than it could be I would either break this up in methods where it makes sense, or otherwise just take the scoping hit for readability
480	shouldn't we bail out if there are dynamic sizes? or, better yet, just add those in this revision
mlir/lib/ExecutionEngine/SparseUtils.cpp
150–151	this class adds an enormous amount of code for a very thin iterator how about just having the very simple start()/next() iteration inside the COO class itself? That way, we can also assert an error if you try to insert while iterating
557	assert(action == kEmpty); disappeared in your rewrite
557	this is of course inside a macro now, but LLVM wants braces {} on all branches if one of them is branched
644	if you use a shorter name, perhaps it fits on one line
757–766	I prefer not to have this function at all; but if we keep it, it should go to CRunnerUtils.cpp since we have some other printing utilities there but this is not related to sparse
mlir/test/Dialect/SparseTensor/conversion_sparse2dense.mlir
16 ↗	(On Diff #382805)	Ah, you made this much more accurate checking than conversion.mlir does (good). But, let's also add a check in that file in the more informal way there
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2dense.mlir
45 ↗	(On Diff #382805)	from dense to sparse
64–69 ↗	(On Diff #382805)	I think you are overthinking this I agree that the silent exit(1) is not good (so perhaps we should just go back to fully FileChecked cases) But how about just printing -12345 on error and exiting and then also add a CHECK-NOT -12345 or something like that but again, I actually prefer the fully FileChecked cases and just rely on printing and verifying the output
132–133 ↗	(On Diff #382805)	yeah, we need to do that or asan will fail Proper way, I think (1) memref.buffer_cast from tensor to memref (2) dealloc memref

In D110790#3091638, @wrengr wrote:

N.B., I'm planning to add support for dynamic sizes, but I want to do that in a separate differential

Ah, I saw this after writing my comments. So you can ignore the part where I asked about that ;-)

Added child differential D112674 for handling dynamic sizes

wrengr added a child revision: D112674: [mlir][sparse] Adding dynamic-size support for sparse=>dense conversion.Oct 27 2021, 4:08 PM

Harbormaster completed remote builds in B131068: Diff 382836.Oct 27 2021, 4:20 PM

Addressing comments

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
300	It's done in D112674.
452	Ha, actually I was right(-ish) the first time. We want the destination encoding here, so we don't apply the dimOrdering permutation twice. The problem was that I was passing a nullptr rather than explicitly constructing the SparseTensorEncodingAttr
480	Done in D112674
mlir/lib/ExecutionEngine/SparseUtils.cpp
757–766	Yeah, I'd rather not have this function too. Unfortunately, when I tried using the CRunnerUtils.cpp functions that vector.print does, I could never get it to link right: it either complained about re-defining a symbol, or about using an un-defined symbol. As for implementing it in the integration test itself, I can't seem to find a way to define string values (not attributes) for passing to fputs() Moved to CRunnerUtils.cpp for now. Will try to rip it out in a future differential.
mlir/test/Dialect/SparseTensor/conversion_sparse2dense.mlir
16 ↗	(On Diff #382805)	I'm not sure I follow. I was just trying to do the same as conversion.mlir (but in the opposite direction, naturally); and elsewhere you suggested breaking things out into a separate test rather than reusing the ones already there.

Harbormaster completed remote builds in B131083: Diff 382862.Oct 27 2021, 5:38 PM

aartbik added inline comments.Oct 27 2021, 6:42 PM

mlir/lib/ExecutionEngine/SparseUtils.cpp
757–766	How about just leaving it at the exit(1) for not and think about this in a future differential. Rather than introducing something we want to rip again anyway?
mlir/test/Dialect/SparseTensor/conversion_sparse2dense.mlir
16 ↗	(On Diff #382805)	Ok, fair enough. I was suggesting to have one sparse->dense check in the conversion.mlir in the more concise style of that test, but having this more rigid tests is indeed sufficient. You can ignore the suggestion ;-)

aartbik added inline comments.Oct 27 2021, 8:42 PM

mlir/lib/ExecutionEngine/CRunnerUtils.cpp
47–58 ↗	(On Diff #382862)	Let's not do this at all, but let's go with just using CHECKs I moved the sparse_conversion test back to this format as well: https://reviews.llvm.org/D112688

Addressing comments

Herald added a subscriber: mgrang. · View Herald TranscriptOct 28 2021, 12:45 PM

Harbormaster completed remote builds in B131271: Diff 383129.Oct 28 2021, 12:54 PM

yeah, looking good. few last nits before we submit...

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
311	Inserts (I usually use the s-form of a verb in the top level comment, but the imperative form in the inlined code)
318–319	is this comment still relevant? I think we can safely remove it?
478	you still have some block scoping left? none of these really release stuff early to keep memory lower so I would opt for readability over block scoping
mlir/lib/ExecutionEngine/SparseUtils.cpp
96	yeah, this is awesome!
120	iteratorLocked = false?
752	feel free to rename this into more intuitive names in a follow up revision too, btw

Addressing nits

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
478	Imo the blocks improve readability rather than detract.
mlir/lib/ExecutionEngine/SparseUtils.cpp
752	Will do

aartbik added inline comments.Oct 28 2021, 1:34 PM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
452–463	how about removing this comment block and simply using // Setup a synthetic all-dense, no-permutation encoding for the dense destination. encDst = SparseTensorEncodingAttr::get( op->getContext(), SmallVector<SparseTensorEncodingAttr::DimLevelType>( rank, SparseTensorEncodingAttr::DimLevelType::Dense), AffineMap(), 0, 0); we don't need anything copied from src here.

Harbormaster completed remote builds in B131281: Diff 383144.Oct 28 2021, 1:37 PM

wrengr marked an inline comment as done.Oct 28 2021, 1:59 PM

wrengr added inline comments.

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
452–463	We need the (ptrTp,indTp,valTp) values for _mlir_ciface_newSparseTensor to enter the right CASE. But I can abbreviate the commentary

Addressing nits

aartbik added inline comments.Oct 28 2021, 2:09 PM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
452–463	this last one may have crossed. I just feel L601-612 takes too much real estate explaining what could be a onliner
478	Ok, acceptable ;-)
mlir/lib/ExecutionEngine/SparseUtils.cpp
604	tensor is a bit more consistent name with IMPL3
613	oh, we will leak memory right now! When iteration is done, we should release the COO tensor/ The easiest way would be to do this if (elem == nullptr) { delete iter; return false; } and document that tensor can no longer be used once getNext returns false
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2dense.mlir
79–81 ↗	(On Diff #383144)	this is taken from the original sparse_conversion.mlir but with your addition it feels like we should (1) remove this comment, or (2) add a comment for tensor4,5,6 as well

aartbik added inline comments.Oct 28 2021, 2:12 PM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
452–463	Oh yeah, right you are. Okay, then indeed just a bit less comment, but same code ;-)
mlir/lib/ExecutionEngine/SparseUtils.cpp
613	this is the most important issue, since without this fix, asan will break the test

Harbormaster completed remote builds in B131293: Diff 383158.Oct 28 2021, 2:19 PM

wrengr marked 7 inline comments as done.Oct 28 2021, 2:46 PM

Fixing memory leak

Ship it, Wren!

This revision is now accepted and ready to land.Oct 28 2021, 3:02 PM

Harbormaster completed remote builds in B131302: Diff 383170.Oct 28 2021, 3:05 PM

Closed by commit rG28882b6575d2: [mlir][sparse] Implementing sparse=>dense conversion. (authored by wrengr). · Explain WhyOct 28 2021, 3:27 PM

This revision was automatically updated to reflect the committed changes.

wrengr added a commit: rG28882b6575d2: [mlir][sparse] Implementing sparse=>dense conversion..

aartbik mentioned this in D112779: [mlir][sparse] fix broken asan test.Oct 28 2021, 8:43 PM

aartbik mentioned this in rG00040d734960: [mlir][sparse] fix broken asan test.Oct 28 2021, 8:54 PM

wrengr mentioned this in D112854: [mlir][sparse] Improve handling of dynamic-sizes for sparse=>dense conversion.Oct 29 2021, 5:09 PM

Revision Contents

Path

Size

mlir/

lib/

Dialect/

SparseTensor/

Transforms/

SparseTensorConversion.cpp

115 lines

ExecutionEngine/

SparseUtils.cpp

81 lines

Diff 376318

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp

Show First 20 Lines • Show All 119 Lines • ▼ Show 20 Lines

/// Generates a call into the "swiss army knife" method of the sparse runtime		/// Generates a call into the "swiss army knife" method of the sparse runtime
/// support library for materializing sparse tensors into the computation. The		/// support library for materializing sparse tensors into the computation. The
/// method returns the call value and assigns the permutation to 'perm'.		/// method returns the call value and assigns the permutation to 'perm'.
static Value genNewCall(ConversionPatternRewriter &rewriter, Operation *op,		static Value genNewCall(ConversionPatternRewriter &rewriter, Operation *op,
SparseTensorEncodingAttr &enc, uint32_t action,		SparseTensorEncodingAttr &enc, uint32_t action,
Value &perm, Value ptr = Value()) {		Value &perm, Value ptr = Value()) {
Location loc = op->getLoc();		Location loc = op->getLoc();
ShapedType resType = op->getResult(0).getType().cast<ShapedType>();		ShapedType resType = op->getResult(0).getType().cast<ShapedType>();
		aartbikUnsubmitted Done Reply Inline Actions returns (to be consistent with Generates) also "as" instead of "at"? (2x) aartbik: returns (to be consistent with Generates) also "as" instead of "at"? (2x)
SmallVector<Value, 8> params;		SmallVector<Value, 8> params;
// Sparsity annotations in tensor constant form.		// Sparsity annotations in tensor constant form.
		aartbikUnsubmitted Done Reply Inline Actions is the "inline" really important (here and below) internal linkage gives the compiler sufficient room to make that decision using built-in heuristics? aartbik: is the "inline" really important (here and below) internal linkage gives the compiler…
SmallVector<APInt, 4> attrs;		SmallVector<APInt, 4> attrs;
unsigned sz = enc.getDimLevelType().size();		unsigned sz = enc.getDimLevelType().size();
for (unsigned i = 0; i < sz; i++)		for (unsigned i = 0; i < sz; i++)
attrs.push_back(		attrs.push_back(
APInt(8, getDimLevelTypeEncoding(enc.getDimLevelType()[i])));		APInt(8, getDimLevelTypeEncoding(enc.getDimLevelType()[i])));
params.push_back(getTensor(rewriter, 8, loc, attrs));		params.push_back(getTensor(rewriter, 8, loc, attrs));
// Dimension sizes array of the enveloping dense tensor. Useful for either		// Dimension sizes array of the enveloping dense tensor. Useful for either
// verification of external data, or for construction of internal data.		// verification of external data, or for construction of internal data.
Show All 38 Lines	static Value genNewCall(ConversionPatternRewriter &rewriter, Operation *op,
// Generate the call to create new tensor.		// Generate the call to create new tensor.
StringRef name = "newSparseTensor";		StringRef name = "newSparseTensor";
auto call = rewriter.create<CallOp>(		auto call = rewriter.create<CallOp>(
loc, pTp, getFunc(op, name, pTp, params, /emitCInterface=/true),		loc, pTp, getFunc(op, name, pTp, params, /emitCInterface=/true),
params);		params);
return call.getResult(0);		return call.getResult(0);
}		}

		/// Generates a constant zero of the appropriate type.
		static Value getZero(ConversionPatternRewriter &r, Location loc, Type t) {
		return r.create<ConstantOp>(loc, r.getZeroAttr(t));
		}

/// Generates the comparison `v != 0` where `v` is of numeric type `t`.		/// Generates the comparison `v != 0` where `v` is of numeric type `t`.
/// For floating types, we use the "unordered" comparator (i.e., returns		/// For floating types, we use the "unordered" comparator (i.e., returns
/// true if `v` is NaN).		/// true if `v` is NaN).
		// TODO(wrengr): use v->getType() in lieu of having the Type parameter?
static Value genIsNonzero(ConversionPatternRewriter &rewriter, Location loc,		static Value genIsNonzero(ConversionPatternRewriter &rewriter, Location loc,
Type t, Value v) {		Type t, Value v) {
Value zero = rewriter.create<ConstantOp>(loc, rewriter.getZeroAttr(t));		Value zero = getZero(rewriter, loc, t);
if (t.isa<FloatType>())		if (t.isa<FloatType>())
return rewriter.create<CmpFOp>(loc, CmpFPredicate::UNE, v, zero);		return rewriter.create<CmpFOp>(loc, CmpFPredicate::UNE, v, zero);
if (t.isIntOrIndex())		if (t.isIntOrIndex())
return rewriter.create<CmpIOp>(loc, CmpIPredicate::ne, v, zero);		return rewriter.create<CmpIOp>(loc, CmpIPredicate::ne, v, zero);
llvm_unreachable("Unknown element type");		llvm_unreachable("Unknown element type");
}		}

/// Generates the code to read the value from tensor[ivs], and conditionally		/// Generates the code to read the value from tensor[ivs], and conditionally
▲ Show 20 Lines • Show All 49 Lines • ▼ Show 20 Lines	static void genAddEltCall(ConversionPatternRewriter &rewriter, Operation *op,
params.push_back(ind);		params.push_back(ind);
params.push_back(perm);		params.push_back(perm);
Type pTp = LLVM::LLVMPointerType::get(IntegerType::get(op->getContext(), 8));		Type pTp = LLVM::LLVMPointerType::get(IntegerType::get(op->getContext(), 8));
rewriter.create<CallOp>(		rewriter.create<CallOp>(
loc, pTp, getFunc(op, name, pTp, params, /emitCInterface=/true),		loc, pTp, getFunc(op, name, pTp, params, /emitCInterface=/true),
params);		params);
}		}

/// If the tensor is a sparse constant, generates and returns the pair of		/// If the tensor is a sparse constant, generates and returns the pair of
		aartbikUnsubmitted Done Reply Inline Actions we typically do not show that much detail of the C++ lib in this file aartbik: we typically do not show that much detail of the C++ lib in this file
/// the constants for the indices and the values.		/// the constants for the indices and the values.
static Optional<std::pair<Value, Value>>		static Optional<std::pair<Value, Value>>
genSplitSparseConstant(ConversionPatternRewriter &rewriter, ConvertOp op,		genSplitSparseConstant(ConversionPatternRewriter &rewriter, ConvertOp op,
Value tensor) {		Value tensor) {
if (auto constOp = tensor.getDefiningOp<ConstantOp>()) {		if (auto constOp = tensor.getDefiningOp<ConstantOp>()) {
if (auto attr = constOp.value().dyn_cast<SparseElementsAttr>()) {		if (auto attr = constOp.value().dyn_cast<SparseElementsAttr>()) {
Location loc = op->getLoc();		Location loc = op->getLoc();
		aartbikUnsubmitted Done Reply Inline Actions This is not a style we use in MLIR, and especially for an internally linked method, and compared to the surrounding methods, it feels a bit heavy aartbik: This is not a style we use in MLIR, and especially for an internally linked method, and…
DenseElementsAttr indicesAttr = attr.getIndices();		DenseElementsAttr indicesAttr = attr.getIndices();
		aartbikUnsubmitted Done Reply Inline Actions let's not document SparseUtils inside this file aartbik: let's not document SparseUtils inside this file
Value indices = rewriter.create<ConstantOp>(loc, indicesAttr);		Value indices = rewriter.create<ConstantOp>(loc, indicesAttr);
DenseElementsAttr valuesAttr = attr.getValues();		DenseElementsAttr valuesAttr = attr.getValues();
Value values = rewriter.create<ConstantOp>(loc, valuesAttr);		Value values = rewriter.create<ConstantOp>(loc, valuesAttr);
return std::make_pair(indices, values);		return std::make_pair(indices, values);
}		}
}		}
return {};		return {};
}		}
Show All 10 Lines	for (unsigned i = 0; i < rank; i++) {
Value val = rewriter.create<tensor::ExtractOp>(loc, indices,		Value val = rewriter.create<tensor::ExtractOp>(loc, indices,
ValueRange{ivs[0], idx});		ValueRange{ivs[0], idx});
val = rewriter.create<IndexCastOp>(loc, val, rewriter.getIndexType());		val = rewriter.create<IndexCastOp>(loc, val, rewriter.getIndexType());
rewriter.create<memref::StoreOp>(loc, val, ind, idx);		rewriter.create<memref::StoreOp>(loc, val, ind, idx);
}		}
return rewriter.create<tensor::ExtractOp>(loc, values, ivs[0]);		return rewriter.create<tensor::ExtractOp>(loc, values, ivs[0]);
}		}

		/// Generates a call to SparseTensorCOO<V>::Iterator::getNext()
		/// If there is a next `Element<V>`: the `indices` will be filled from
		/// that element, and the returned `Value` will be the `V` of that element.
		/// If there is no next `Element<V>`: the `indices` will be left in an
		aartbikUnsubmitted Done Reply Inline Actions not in this revision yet? seems a relatively minor addition (and then we are done for all cases!) aartbik: not in this revision yet? seems a relatively minor addition (and then we are done for all cases!
		wrengrAuthorUnsubmitted Done Reply Inline Actions It's done in D112674. wrengr: It's done in D112674.
		/// indeterminate state (in practice it'll be left unmodified), and
		/// the returned `Value` is zero-- which (by definition) is never a
		/// valid `V` for `SparseTensorCOO<V>` to contain, so there's no chance
		/// of confusion, nor any loss of expressivity.
		static Value genGetNextCall(ConversionPatternRewriter &rewriter, Operation *op,
		Value iter, Value indices, Value perm) {
		Location loc = op->getLoc();
		Type elemTp = iter.getType().cast<ShapedType>().getElementType();
		StringRef name;
		if (elemTp.isF64())
		name = "getNextF64";
		aartbikUnsubmitted Done Reply Inline Actions Inserts (I usually use the s-form of a verb in the top level comment, but the imperative form in the inlined code) aartbik: Inserts (I usually use the s-form of a verb in the top level comment, but the imperative form…
		else if (elemTp.isF32())
		name = "getNextF32";
		else if (elemTp.isInteger(64))
		aartbikUnsubmitted Done Reply Inline Actions same aartbik: same
		name = "getNextI64";
		else if (elemTp.isInteger(32))
		name = "getNextI32";
		else if (elemTp.isInteger(16))
		name = "getNextI16";
		aartbikUnsubmitted Done Reply Inline Actions is this comment still relevant? I think we can safely remove it? aartbik: is this comment still relevant? I think we can safely remove it?
		else if (elemTp.isInteger(8))
		name = "getNextI8";
		else
		llvm_unreachable("Unknown element type");
		SmallVector<Value, 8> params;
		params.push_back(iter);
		params.push_back(indices);
		params.push_back(perm);
		auto call = rewriter.create<CallOp>(
		loc, elemTp, getFunc(op, name, elemTp, params, /emitCInterface=/true),
		params);
		return call.getResult(0);
		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Conversion rules.		// Conversion rules.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// Sparse conversion rule for returns.		/// Sparse conversion rule for returns.
class SparseReturnConverter : public OpConversionPattern<ReturnOp> {		class SparseReturnConverter : public OpConversionPattern<ReturnOp> {
public:		public:
using OpConversionPattern::OpConversionPattern;		using OpConversionPattern::OpConversionPattern;
▲ Show 20 Lines • Show All 48 Lines • ▼ Show 20 Lines	if (!enc)
return failure();		return failure();
Value perm;		Value perm;
rewriter.replaceOp(		rewriter.replaceOp(
op, genNewCall(rewriter, op, enc, 0, perm, adaptor.getOperands()[0]));		op, genNewCall(rewriter, op, enc, 0, perm, adaptor.getOperands()[0]));
return success();		return success();
}		}
};		};

		static Value allocaIndices(ConversionPatternRewriter &rewriter, Location loc,
		int64_t rank) {
		auto indexTp = rewriter.getIndexType();
		auto memTp = MemRefType::get({ShapedType::kDynamicSize}, indexTp);
		Value arg = rewriter.create<ConstantOp>(loc, rewriter.getIndexAttr(rank));
		return rewriter.create<memref::AllocaOp>(loc, memTp, ValueRange{arg});
		}

		static Value allocDenseTensor(ConversionPatternRewriter &rewriter, Location loc,
		ShapedType shapeTp) {
		Type elemTp = shapeTp.getElementType();
		Value tensor = rewriter.create<memref::AllocOp>(
		loc, MemRefType::get(shapeTp.getShape(), elemTp));
		rewriter.create<linalg::FillOp>(loc, getZero(rewriter, loc, elemTp), tensor);
		return tensor;
		}

/// Sparse conversion rule for the convert operator.		/// Sparse conversion rule for the convert operator.
class SparseTensorConvertConverter : public OpConversionPattern<ConvertOp> {		class SparseTensorConvertConverter : public OpConversionPattern<ConvertOp> {
using OpConversionPattern::OpConversionPattern;		using OpConversionPattern::OpConversionPattern;
LogicalResult		LogicalResult
matchAndRewrite(ConvertOp op, OpAdaptor adaptor,		matchAndRewrite(ConvertOp op, OpAdaptor adaptor,
ConversionPatternRewriter &rewriter) const override {		ConversionPatternRewriter &rewriter) const override {
Type resType = op.getType();		Type resType = op.getType();
auto encDst = getSparseTensorEncoding(resType);		auto encDst = getSparseTensorEncoding(resType);
auto encSrc = getSparseTensorEncoding(op.source().getType());		auto encSrc = getSparseTensorEncoding(op.source().getType());
		auto src = adaptor.getOperands()[0];
if (encDst && encSrc) {		if (encDst && encSrc) {
// This is a sparse => sparse conversion, which is handled as follows:		// This is a sparse => sparse conversion, which is handled as follows:
// t = src->toCOO(); ; src to COO in dst order		// t = src->toCOO(); ; src to COO in dst order
// dst = newSparseTensor(t)		// dst = newSparseTensor(t)
// Using the coordinate scheme as an intermediate does not always		// Using the coordinate scheme as an intermediate does not always
// yield the fastest conversion but avoids the need for a full		// yield the fastest conversion but avoids the need for a full
// O(N^2) conversion matrix.		// O(N^2) conversion matrix.
Value perm;		Value perm;
Value coo =		Value coo = genNewCall(rewriter, op, encDst, 3, perm, src);
genNewCall(rewriter, op, encDst, 3, perm, adaptor.getOperands()[0]);
rewriter.replaceOp(op, genNewCall(rewriter, op, encDst, 1, perm, coo));		rewriter.replaceOp(op, genNewCall(rewriter, op, encDst, 1, perm, coo));
return success();		return success();
}		}
if (!encDst \|\| encSrc) {		if (!encDst \|\| encSrc) {
// TODO: sparse => dense		// This is sparse => dense conversion, which is handled as follows:
return failure();		// dst = new MemRef(0);
		// iter = src->toCOO()->getIterator();
		// while (iter->hasNext()) {
		aartbikUnsubmitted Done Reply Inline Actions using the "zero" data value itself has a bit of a risk that a sparse data structure with a few explicitly stored zeros will bail this loop too soon; having a 0/1 result and passing the value as another ref parameter seems safer aartbik: using the "zero" data value itself has a bit of a risk that a sparse data structure with a few…
		wrengrAuthorUnsubmitted Done Reply Inline Actions Gross; but yes, you're right. I think I'll leave the commentary here as-is, however, since —like the field accesses on line 452, and like the commentary for other branches— it's reflecting more what the C++ code in the ExecutionEngine does, rather than reflecting what the generated MLIR code does; and there's no ambiguity about what the `getNext()` method returns, even though `IMPL_COO_GETNEXT` introduces the problem you mention. wrengr: Gross; but yes, you're right. I think I'll leave the commentary here as-is, however, since…
		// elem = iter->getNext();
		// dst[elem.indices] = elem.value;
		// }
		// While it would be more efficient to inline the iterator logic
		// directly rather than allocating an object and calling methods,
		aartbikUnsubmitted Done Reply Inline Actions Note that in the context of another project, we may migrate library code to actual codegen (which has the advantage of a smaller memory footprint potentially and allows for "unforeseen" type combinations). Such a migration may take care of all such performance concerns. aartbik: Note that in the context of another project, we may migrate library code to actual codegen…
		wrengrAuthorUnsubmitted Done Reply Inline Actions Yep, that's what I was thinking re "good enough for now" :) I mainly added the comment since when discussing the design with Tatiana, she raised some concerns about the performance implications of the method/function calls. For our current goals, I can't imagine this branch would be taken often enough to constitute a performance bottleneck. (And whenever it does, it's easy enough to fix at that point.) wrengr: Yep, that's what I was thinking re "good enough for now" :) I mainly added the comment since…
		aartbikUnsubmitted Done Reply Inline Actions remove this; it more or less applies to all method calls here, and it will fall under the general umbrella of perhaps moving to 100% codegen over support lib usage.... aartbik: remove this; it more or less applies to all method calls here, and it will fall under the…
		// this is good enough for now.
		Location loc = op->getLoc();
		ShapedType shapeTp = resType.cast<ShapedType>();
		Type elemTp = shapeTp.getElementType();
		aartbikUnsubmitted Done Reply Inline Actions we don't emit errors when rewriting rules fail to apply aartbik: we don't emit errors when rewriting rules fail to apply
		Value dst = allocDenseTensor(rewriter, loc, shapeTp);
		Value indices = allocaIndices(rewriter, loc, shapeTp.getRank());
		aartbikUnsubmitted Done Reply Inline Actions I think we need a subtle rewriting of the new call utility since we need the sparse encoding for some info, but the id permutation for the "new" tensor (ie. the dense result) aartbik: I think we need a subtle rewriting of the new call utility since we need the sparse encoding…
		wrengrAuthorUnsubmitted Done Reply Inline Actions Yeah, I've been mulling over a few ways to clean genNewCall up. Did you want me to do that in/before this differential, or is it okay to do afterwards? wrengr: Yeah, I've been mulling over a few ways to clean genNewCall up. Did you want me to do that…
		bixiaUnsubmitted Done Reply Inline Actions this should be encSrc, not encDst, right? bixia: this should be encSrc, not encDst, right?
		wrengrAuthorUnsubmitted Done Reply Inline Actions Good catch :) That lead to an utterly inscrutable crash stacktrace wrengr: Good catch :) That lead to an utterly inscrutable crash stacktrace
		wrengrAuthorUnsubmitted Done Reply Inline Actions Ha, actually I was right(-ish) the first time. We want the destination encoding here, so we don't apply the dimOrdering permutation twice. The problem was that I was passing a nullptr rather than explicitly constructing the SparseTensorEncodingAttr wrengr: Ha, actually I was right(-ish) the first time. We want the destination encoding here, so we…
		Value perm;
		Value iter = genNewCall(rewriter, op, encDst, 4, perm, src);
		// Generate the while-loop Op itself.
		TypeRange argTypes{};
		ValueRange args{};
		scf::WhileOp whileOp = rewriter.create<scf::WhileOp>(loc, argTypes, args);
		Block *before = rewriter.createBlock(&whileOp.before(), {}, argTypes);
		aartbikUnsubmitted Done Reply Inline Actions all this block scoping makes the code more lengthy than it could be I would either break this up in methods where it makes sense, or otherwise just take the scoping hit for readability aartbik: all this block scoping makes the code more lengthy than it could be I would either break this…
		Block *after = rewriter.createBlock(&whileOp.after(), {}, argTypes);
		// Build the while-loop's "before" region.
		rewriter.setInsertionPointToEnd(before);
		Value elemVal = genGetNextCall(rewriter, op, iter, indices, perm);
		aartbikUnsubmitted Done Reply Inline Actions how about removing this comment block and simply using // Setup a synthetic all-dense, no-permutation encoding for the dense destination. encDst = SparseTensorEncodingAttr::get( op->getContext(), SmallVector<SparseTensorEncodingAttr::DimLevelType>( rank, SparseTensorEncodingAttr::DimLevelType::Dense), AffineMap(), 0, 0); we don't need anything copied from src here. aartbik: how about removing this comment block and simply using // Setup a synthetic all-dense, no…
		aartbikUnsubmitted Done Reply Inline Actions this last one may have crossed. I just feel L601-612 takes too much real estate explaining what could be a onliner aartbik: this last one may have crossed. I just feel L601-612 takes too much real estate explaining what…
		wrengrAuthorUnsubmitted Done Reply Inline Actions We need the (ptrTp,indTp,valTp) values for _mlir_ciface_newSparseTensor to enter the right CASE. But I can abbreviate the commentary wrengr: We need the (ptrTp,indTp,valTp) values for _mlir_ciface_newSparseTensor to enter the right CASE.
		aartbikUnsubmitted Done Reply Inline Actions Oh yeah, right you are. Okay, then indeed just a bit less comment, but same code ;-) aartbik: Oh yeah, right you are. Okay, then indeed just a bit less comment, but same code ;-)
		Value cond = genIsNonzero(rewriter, loc, elemTp, elemVal);
		rewriter.create<scf::ConditionOp>(loc, cond, before->getArguments());
		// Build the while-loop's "after" region.
		rewriter.setInsertionPointToStart(after);
		aartbikUnsubmitted Done Reply Inline Actions you cannot just pass "indices" to next call and use it here; you will need the IR to load the contents returned through the memref by the getNext() call, using explicit memref::LoadOp ops on the elements in the memref and passing this to the StoreOp aartbik: you cannot just pass "indices" to next call and use it here; you will need the IR to load the…
		rewriter.create<memref::StoreOp>(loc, elemVal, dst, indices);
		// Finish up.
		rewriter.replaceOp(op, dst);
		aartbikUnsubmitted Done Reply Inline Actions You are replacing the op (which returns a tensor) with the result of an alloc (which is a memref). That type mismatch will fail. You need a buffer cast in between. aartbik: You are replacing the op (which returns a tensor) with the result of an alloc (which is a…
		aartbikUnsubmitted Done Reply Inline Actions This replacement will also need some legality check changes. Up to now, we were replacing sparse tensors with opague pointers, and the checks/rewriting did all the work But now we have dense_tensor = convert .... return dense_tenor and the mechanism will need some "love" to make it accept the rewriting even though the types were already legal to start with aartbik: This replacement will also need some legality check changes. Up to now, we were replacing…
		return success();
		aartbikUnsubmitted Done Reply Inline Actions Technically, we also fall into this branch for the strange dense to dense conversion. We will have typically folded those away, but I would not completely rely on this having taken place always and defend against that in the code. aartbik: Technically, we also fall into this branch for the strange dense to dense conversion. We will…
		wrengrAuthorUnsubmitted Done Reply Inline Actions Just to be clear, that should only ever happen when the source is a "sparse" tensor that happens to use dense storage for all dimensions, right? Which is to say, the destination is always a bog-standard dense tensor, right? If so then the code should still work; if not then we'll have to figure out how to tighten up the guards for detecting the different cases. I'll update the commentary (on the assumption that the answer to the first question is yes; of course, if that's true, then shouldn't the "sparse=>sparse" case have the same caveat?) wrengr: Just to be clear, that should only ever happen when the source is a "sparse" tensor that…
		aartbikUnsubmitted Done Reply Inline Actions No, I really means no encoding at all. There is a subtle difference between an un-annotated tensor and a tensor with all-dense annotations. All conversions work for the all-dense annotated case (it is treated as a sort of sparse tensors). But the logic on falling into a branch based on encDst, !encDest, srcDest, !srcDest (so four truth values) fell into this branch for two cases, but you only implemented the sparse->dense. So you will have to add one if-test and return failure. aartbik: No, I really means no encoding at all. There is a subtle difference between an un-annotated…
		wrengrAuthorUnsubmitted Done Reply Inline Actions Got it wrengr: Got it
}		}
// This is a dense => sparse conversion or a sparse constant in COO =>		// This is a dense => sparse conversion or a sparse constant in COO =>
// sparse conversion, which is handled as follows:		// sparse conversion, which is handled as follows:
// t = newSparseCOO()		// t = newSparseCOO()
// ...code to fill the COO tensor t...		// ...code to fill the COO tensor t...
// s = newSparseTensor(t)		// s = newSparseTensor(t)
//		//
		aartbikUnsubmitted Done Reply Inline Actions you still have some block scoping left? none of these really release stuff early to keep memory lower so I would opt for readability over block scoping aartbik: you still have some block scoping left? none of these really release stuff early to keep memory…
		wrengrAuthorUnsubmitted Done Reply Inline Actions Imo the blocks improve readability rather than detract. wrengr: Imo the blocks improve readability rather than detract.
		aartbikUnsubmitted Done Reply Inline Actions Ok, acceptable ;-) aartbik: Ok, acceptable ;-)
// To fill the COO tensor from a dense tensor:		// To fill the COO tensor from a dense tensor:
// for i1 in dim1		// for i1 in dim1
		aartbikUnsubmitted Done Reply Inline Actions shouldn't we bail out if there are dynamic sizes? or, better yet, just add those in this revision aartbik: shouldn't we bail out if there are dynamic sizes? or, better yet, just add those in this…
		wrengrAuthorUnsubmitted Done Reply Inline Actions Done in D112674 wrengr: Done in D112674
// ..		// ..
// for ik in dimk		// for ik in dimk
// val = a[i1,..,ik]		// val = a[i1,..,ik]
// if val != 0		// if val != 0
// t->add(val, [i1,..,ik], [p1,..,pk])		// t->add(val, [i1,..,ik], [p1,..,pk])
//		//
// To fill the COO tensor from a sparse constant in COO format:		// To fill the COO tensor from a sparse constant in COO format:
// for i in range(NNZ)		// for i in range(NNZ)
// val = values[i]		// val = values[i]
// [i1,..,ik] = indices[i]		// [i1,..,ik] = indices[i]
// t->add(val, [i1,..,ik], [p1,..,pk])		// t->add(val, [i1,..,ik], [p1,..,pk])
//		//
// Note that the dense tensor traversal code is actually implemented		// Note that the dense tensor traversal code is actually implemented
// using MLIR IR to avoid having to expose too much low-level		// using MLIR IR to avoid having to expose too much low-level
// memref traversal details to the runtime support library.		// memref traversal details to the runtime support library.
// Also note that the code below only generates the "new" ops and		// Also note that the code below only generates the "new" ops and
// the loop-nest per se; whereas the entire body of the innermost		// the loop-nest per se; whereas the entire body of the innermost
// loop is generated by genAddElt().		// loop is generated by genAddElt().
Location loc = op->getLoc();		Location loc = op->getLoc();
ShapedType shape = resType.cast<ShapedType>();		ShapedType shape = resType.cast<ShapedType>();
auto memTp =
MemRefType::get({ShapedType::kDynamicSize}, rewriter.getIndexType());
Value perm;		Value perm;
Value ptr = genNewCall(rewriter, op, encDst, 2, perm);		Value ptr = genNewCall(rewriter, op, encDst, 2, perm);
Value arg = rewriter.create<ConstantOp>(		Value ind = allocaIndices(rewriter, loc, shape.getRank());
loc, rewriter.getIndexAttr(shape.getRank()));
Value ind = rewriter.create<memref::AllocaOp>(loc, memTp, ValueRange{arg});
SmallVector<Value> lo;		SmallVector<Value> lo;
SmallVector<Value> hi;		SmallVector<Value> hi;
SmallVector<Value> st;		SmallVector<Value> st;
Value zero = rewriter.create<ConstantOp>(loc, rewriter.getIndexAttr(0));		Value zero = rewriter.create<ConstantOp>(loc, rewriter.getIndexAttr(0));
Value one = rewriter.create<ConstantOp>(loc, rewriter.getIndexAttr(1));		Value one = rewriter.create<ConstantOp>(loc, rewriter.getIndexAttr(1));
Value tensor = adaptor.getOperands()[0];		auto indicesValues = genSplitSparseConstant(rewriter, op, src);
auto indicesValues = genSplitSparseConstant(rewriter, op, tensor);
bool isCOOConstant = indicesValues.hasValue();		bool isCOOConstant = indicesValues.hasValue();
Value indices;		Value indices;
Value values;		Value values;
if (isCOOConstant) {		if (isCOOConstant) {
indices = indicesValues->first;		indices = indicesValues->first;
values = indicesValues->second;		values = indicesValues->second;
lo.push_back(zero);		lo.push_back(zero);
hi.push_back(linalg::createOrFoldDimOp(rewriter, loc, values, 0));		hi.push_back(linalg::createOrFoldDimOp(rewriter, loc, values, 0));
st.push_back(one);		st.push_back(one);
} else {		} else {
for (unsigned i = 0, rank = shape.getRank(); i < rank; i++) {		for (unsigned i = 0, rank = shape.getRank(); i < rank; i++) {
lo.push_back(zero);		lo.push_back(zero);
hi.push_back(linalg::createOrFoldDimOp(rewriter, loc, tensor, i));		hi.push_back(linalg::createOrFoldDimOp(rewriter, loc, src, i));
st.push_back(one);		st.push_back(one);
}		}
}		}
Type eltType = shape.getElementType();		Type eltType = shape.getElementType();
unsigned rank = shape.getRank();		unsigned rank = shape.getRank();
scf::buildLoopNest(rewriter, op.getLoc(), lo, hi, st, {},		scf::buildLoopNest(rewriter, op.getLoc(), lo, hi, st, {},
[&](OpBuilder &builder, Location loc, ValueRange ivs,		[&](OpBuilder &builder, Location loc, ValueRange ivs,
ValueRange args) -> scf::ValueVector {		ValueRange args) -> scf::ValueVector {
Value val;		Value val;
if (isCOOConstant)		if (isCOOConstant)
val = genIndexAndValueForSparse(		val = genIndexAndValueForSparse(
rewriter, op, indices, values, ind, ivs, rank);		rewriter, op, indices, values, ind, ivs, rank);
else		else
val = genIndexAndValueForDense(rewriter, op, eltType,		val = genIndexAndValueForDense(rewriter, op, eltType,
tensor, ind, ivs);		src, ind, ivs);
genAddEltCall(rewriter, op, eltType, ptr, val, ind,		genAddEltCall(rewriter, op, eltType, ptr, val, ind,
perm);		perm);
return {};		return {};
});		});
rewriter.replaceOp(op, genNewCall(rewriter, op, encDst, 1, perm, ptr));		rewriter.replaceOp(op, genNewCall(rewriter, op, encDst, 1, perm, ptr));
return success();		return success();
}		}
};		};
▲ Show 20 Lines • Show All 143 Lines • Show Last 20 Lines

mlir/lib/ExecutionEngine/SparseUtils.cpp

Show First 20 Lines • Show All 87 Lines • ▼ Show 20 Lines
public:		public:
SparseTensorCOO(const std::vector<uint64_t> &szs, uint64_t capacity)		SparseTensorCOO(const std::vector<uint64_t> &szs, uint64_t capacity)
: sizes(szs) {		: sizes(szs) {
if (capacity)		if (capacity)
elements.reserve(capacity);		elements.reserve(capacity);
}		}
/// Adds element as indices and value.		/// Adds element as indices and value.
void add(const std::vector<uint64_t> &ind, V val) {		void add(const std::vector<uint64_t> &ind, V val) {
assert(getRank() == ind.size());		assert(getRank() == ind.size());
		aartbikUnsubmitted Done Reply Inline Actions yeah, this is awesome! aartbik: yeah, this is awesome!
for (uint64_t r = 0, rank = getRank(); r < rank; r++)		for (uint64_t r = 0, rank = getRank(); r < rank; r++)
assert(ind[r] < sizes[r]); // within bounds		assert(ind[r] < sizes[r]); // within bounds
elements.emplace_back(ind, val);		elements.emplace_back(ind, val);
}		}
/// Sorts elements lexicographically by index.		/// Sorts elements lexicographically by index.
void sort() { std::sort(elements.begin(), elements.end(), lexOrder); }		void sort() { std::sort(elements.begin(), elements.end(), lexOrder); }
/// Returns rank.		/// Returns rank.
uint64_t getRank() const { return sizes.size(); }		uint64_t getRank() const { return sizes.size(); }
/// Getter for sizes array.		/// Getter for sizes array.
const std::vector<uint64_t> &getSizes() const { return sizes; }		const std::vector<uint64_t> &getSizes() const { return sizes; }
/// Getter for elements array.		/// Getter for elements array.
const std::vector<Element<V>> &getElements() const { return elements; }		const std::vector<Element<V>> &getElements() const { return elements; }

		// Forward declaration of the class required by getIterator. We make
		// it a nexted class so that it can access the private fields.
		bixiaUnsubmitted Done Reply Inline Actions nested bixia: nested
		class Iterator;
		/// Returns an iterator over the elements of a SparseTensorCOO.
		Iterator getIterator() const { return new Iterator(this); }

/// Factory method. Permutes the original dimensions according to		/// Factory method. Permutes the original dimensions according to
/// the given ordering and expects subsequent add() calls to honor		/// the given ordering and expects subsequent add() calls to honor
/// that same ordering for the given indices. The result is a		/// that same ordering for the given indices. The result is a
/// fully permuted coordinate scheme.		/// fully permuted coordinate scheme.
static SparseTensorCOO<V> *newSparseTensorCOO(uint64_t size,		static SparseTensorCOO<V> *newSparseTensorCOO(uint64_t size,
		aartbikUnsubmitted Done Reply Inline Actions iteratorLocked = false? aartbik: iteratorLocked = false?
const uint64_t *sizes,		const uint64_t *sizes,
const uint64_t *perm,		const uint64_t *perm,
uint64_t capacity = 0) {		uint64_t capacity = 0) {
std::vector<uint64_t> permsz(size);		std::vector<uint64_t> permsz(size);
for (uint64_t r = 0; r < size; r++)		for (uint64_t r = 0; r < size; r++)
permsz[perm[r]] = sizes[r];		permsz[perm[r]] = sizes[r];
return new SparseTensorCOO<V>(permsz, capacity);		return new SparseTensorCOO<V>(permsz, capacity);
}		}

private:		private:
/// Returns true if indices of e1 < indices of e2.		/// Returns true if indices of e1 < indices of e2.
static bool lexOrder(const Element<V> &e1, const Element<V> &e2) {		static bool lexOrder(const Element<V> &e1, const Element<V> &e2) {
assert(e1.indices.size() == e2.indices.size());		assert(e1.indices.size() == e2.indices.size());
for (uint64_t r = 0, rank = e1.indices.size(); r < rank; r++) {		for (uint64_t r = 0, rank = e1.indices.size(); r < rank; r++) {
if (e1.indices[r] == e2.indices[r])		if (e1.indices[r] == e2.indices[r])
continue;		continue;
return e1.indices[r] < e2.indices[r];		return e1.indices[r] < e2.indices[r];
}		}
return false;		return false;
}		}
std::vector<uint64_t> sizes; // per-rank dimension sizes		const std::vector<uint64_t> sizes; // per-rank dimension sizes
std::vector<Element<V>> elements;		std::vector<Element<V>> elements;
};		};

		/// This iterator is specifically designed for the needs of the MLIR
		/// generated for sparse=>dense conversion; hence why it is so
		aartbikUnsubmitted Done Reply Inline Actions this sentence does not flow quite well, or am I reading it wrong? aartbik: this sentence does not flow quite well, or am I reading it wrong?
		wrengrAuthorUnsubmitted Done Reply Inline Actions I'll try rewording wrengr: I'll try rewording
		/// idiosyncratic compared to a more conventional iterator for use within
		/// C++ itself.
		template <typename V>
		class SparseTensorCOO<V>::Iterator {
		// TODO(wrengr): really this class should be a thin wrapper/subclass
		aartbikUnsubmitted Done Reply Inline Actions this class adds an enormous amount of code for a very thin iterator how about just having the very simple start()/next() iteration inside the COO class itself? That way, we can also assert an error if you try to insert while iterating aartbik: this class adds an enormous amount of code for a very thin iterator how about just having the…
		// of the std::vector, rather than needing to do a dereference every
		// time a method is called; but we don't want to actually copy the whole
		// contents of the underlying array(s) when this class is initialized.
		// Maybe we should be a thin wrapper/subclass of SparseTensorCOO?
		// Or have a variant of SparseTensorStorage::toCOO() to construct this
		// iterator directly?
		const std::vector<Element<V>> &elements;
		unsigned pos;

		public:
		// TODO(wrengr): to guarantee safety we'd either need to consume the
		// SparseTensorCOO (e.g., requiring an rvalue-reference) or get notified
		// somehow whenever the SparseTensorCOO adds new elements, sorts, etc.
		Iterator(const SparseTensorCOO<V> &coo) : elements(coo.elements), pos(0) {}
		// TODO(wrengr): could combine these into one if we returned a pointer
		// instead of a reference (so that we can return `nullptr` when OOB).
		bool hasNext() const { return pos < elements.size(); }
		const Element<V> &getNext() {
		assert(hasNext());
		return elements[pos++];
		}
		};

/// Abstract base class of sparse tensor storage. Note that we use		/// Abstract base class of sparse tensor storage. Note that we use
/// function overloading to implement "partial" method specialization.		/// function overloading to implement "partial" method specialization.
class SparseTensorStorageBase {		class SparseTensorStorageBase {
public:		public:
enum DimLevelType : uint8_t { kDense = 0, kCompressed = 1, kSingleton = 2 };		enum DimLevelType : uint8_t { kDense = 0, kCompressed = 1, kSingleton = 2 };

virtual uint64_t getDimSize(uint64_t) = 0;		virtual uint64_t getDimSize(uint64_t) = 0;

▲ Show 20 Lines • Show All 365 Lines • ▼ Show 20 Lines	if (ptrTp == (p) && indTp == (i) && valTp == (v)) { \
SparseTensorCOO<V> *tensor = nullptr; \		SparseTensorCOO<V> *tensor = nullptr; \
if (action == 0) \		if (action == 0) \
tensor = \		tensor = \
openSparseTensorCOO<V>(static_cast<char *>(ptr), size, sizes, perm); \		openSparseTensorCOO<V>(static_cast<char *>(ptr), size, sizes, perm); \
else if (action == 1) \		else if (action == 1) \
tensor = static_cast<SparseTensorCOO<V> *>(ptr); \		tensor = static_cast<SparseTensorCOO<V> *>(ptr); \
else if (action == 2) \		else if (action == 2) \
return SparseTensorCOO<V>::newSparseTensorCOO(size, sizes, perm); \		return SparseTensorCOO<V>::newSparseTensorCOO(size, sizes, perm); \
else \		else { \
return static_cast<SparseTensorStorage<P, I, V> *>(ptr)->toCOO(perm); \		tensor = static_cast<SparseTensorStorage<P, I, V> *>(ptr)->toCOO(perm); \
		aartbikUnsubmitted Done Reply Inline Actions assert(action == kEmpty); disappeared in your rewrite aartbik: assert(action == kEmpty); disappeared in your rewrite
		aartbikUnsubmitted Done Reply Inline Actions this is of course inside a macro now, but LLVM wants braces {} on all branches if one of them is branched aartbik: this is of course inside a macro now, but LLVM wants braces {} on all branches if one of them…
		if (action == 3) \
		return tensor; \
		return tensor->getIterator(); \
		} \
return SparseTensorStorage<P, I, V>::newSparseTensor(tensor, sparsity, \		return SparseTensorStorage<P, I, V>::newSparseTensor(tensor, sparsity, \
perm); \		perm); \
}		}

#define IMPL1(NAME, TYPE, LIB) \		#define IMPL1(NAME, TYPE, LIB) \
void _mlir_ciface_##NAME(StridedMemRefType<TYPE, 1> ref, void tensor) { \		void _mlir_ciface_##NAME(StridedMemRefType<TYPE, 1> ref, void tensor) { \
std::vector<TYPE> *v; \		std::vector<TYPE> *v; \
static_cast<SparseTensorStorageBase *>(tensor)->LIB(&v); \		static_cast<SparseTensorStorageBase *>(tensor)->LIB(&v); \
Show All 25 Lines	void _mlir_ciface_##NAME(void tensor, TYPE value, \
uint64_t isize = iref->sizes[0]; \		uint64_t isize = iref->sizes[0]; \
std::vector<uint64_t> indices(isize); \		std::vector<uint64_t> indices(isize); \
for (uint64_t r = 0; r < isize; r++) \		for (uint64_t r = 0; r < isize; r++) \
indices[perm[r]] = indx[r]; \		indices[perm[r]] = indx[r]; \
static_cast<SparseTensorCOO<TYPE> *>(tensor)->add(indices, value); \		static_cast<SparseTensorCOO<TYPE> *>(tensor)->add(indices, value); \
return tensor; \		return tensor; \
}		}

		// TODO(wrengr): Why do we need to handle the permutation manually
		// here? Why doesn't the SparseTensorCOO type handle it itself when constructed?
		aartbikUnsubmitted Done Reply Inline Actions We really should not need permutation here. If you call toCOO with ID permutation, it restores the indices to original order. Note that toCOO takes the permutation of the target, and internally restores permutation, if there was one from source (the internally stored inverse permutation). So if you call toCOO with ID perm, you don't need the perm here anymore, since indices are in the natural "dense" order MLIR expects. aartbik: We really should not need permutation here. If you call toCOO with ID permutation, it restores…
		wrengrAuthorUnsubmitted Done Reply Inline Actions Oh good. I thought that's how things worked, but I wasn't quite certain. wrengr: Oh good. I thought that's how things worked, but I wasn't quite certain.
		aartbikUnsubmitted Done Reply Inline Actions tensor is a bit more consistent name with IMPL3 aartbik: tensor is a bit more consistent name with IMPL3
		/// Calls SparseTensorCOO<V>::Iterator::getNext() with the following semantics.
		/// If there is a next `Element<V>`: the `iref` will be filled from that
		/// element, and the element's value is returned.
		/// If there is no next `Element<V>`: the `iref` will be left in an
		/// indeterminate state (in practice it'll be left unmodified), and
		/// the return value is zero-- which (by definition) is never a valid `V`
		/// for `SparseTensorCOO<V>` to contain, so there's no chance of confusion,
		/// nor any loss of expressivity.
		#define IMPL_COO_GETNEXT(NAME, V) \
		aartbikUnsubmitted Done Reply Inline Actions oh, we will leak memory right now! When iteration is done, we should release the COO tensor/ The easiest way would be to do this if (elem == nullptr) { delete iter; return false; } and document that tensor can no longer be used once getNext returns false aartbik: oh, we will leak memory right now! When iteration is done, we should release the COO tensor/…
		aartbikUnsubmitted Done Reply Inline Actions this is the most important issue, since without this fix, asan will break the test aartbik: this is the most important issue, since without this fix, asan will break the test
		V _mlir_ciface_##NAME(void ptr, StridedMemRefType<uint64_t, 1> iref, \
		StridedMemRefType<uint64_t, 1> *pref) { \
		assert(iref->strides[0] == 1 && pref->strides[0] == 1); \
		assert(iref->sizes[0] == pref->sizes[0]); \
		uint64_t *indx = iref->data + iref->offset; \
		const uint64_t *perm = pref->data + pref->offset; \
		const uint64_t isize = iref->sizes[0]; \
		auto iter = static_cast<SparseTensorCOO<V>::Iterator *>(ptr); \
		if (!(iter->hasNext())) \
		return 0; \
		const Element<V> &elem = iter->getNext(); \
		for (uint64_t r = 0; r < isize; r++) \
		indx[r] = elem.indices[perm[r]]; \
		return elem.value; \
		}

enum OverheadTypeEnum : uint64_t { kU64 = 1, kU32 = 2, kU16 = 3, kU8 = 4 };		enum OverheadTypeEnum : uint64_t { kU64 = 1, kU32 = 2, kU16 = 3, kU8 = 4 };

enum PrimaryTypeEnum : uint64_t {		enum PrimaryTypeEnum : uint64_t {
kF64 = 1,		kF64 = 1,
kF32 = 2,		kF32 = 2,
kI64 = 3,		kI64 = 3,
kI32 = 4,		kI32 = 4,
kI16 = 5,		kI16 = 5,
kI8 = 6		kI8 = 6
};		};

/// Constructs a new sparse tensor. This is the "swiss army knife"		/// Constructs a new sparse tensor. This is the "swiss army knife"
/// method for materializing sparse tensors into the computation.		/// method for materializing sparse tensors into the computation.
/// action		/// action
/// 0 : ptr contains filename to read into storage		/// 0 : ptr contains filename to read into storage
		aartbikUnsubmitted Done Reply Inline Actions if you use a shorter name, perhaps it fits on one line aartbik: if you use a shorter name, perhaps it fits on one line
/// 1 : ptr contains coordinate scheme to assign to new storage		/// 1 : ptr contains coordinate scheme to assign to new storage
/// 2 : returns empty coordinate scheme to fill (call back 1 to setup)		/// 2 : returns empty coordinate scheme to fill (call back 1 to setup)
/// 3 : returns coordinate scheme from storage in ptr (call back 1 to convert)		/// 3 : returns coordinate scheme from storage in ptr (call back 1 to convert)
void *		void *
_mlir_ciface_newSparseTensor(StridedMemRefType<uint8_t, 1> *aref, // NOLINT		_mlir_ciface_newSparseTensor(StridedMemRefType<uint8_t, 1> *aref, // NOLINT
StridedMemRefType<uint64_t, 1> *sref,		StridedMemRefType<uint64_t, 1> *sref,
StridedMemRefType<uint64_t, 1> *pref,		StridedMemRefType<uint64_t, 1> *pref,
uint64_t ptrTp, uint64_t indTp, uint64_t valTp,		uint64_t ptrTp, uint64_t indTp, uint64_t valTp,
▲ Show 20 Lines • Show All 83 Lines • ▼ Show 20 Lines
/// Helper to add value to coordinate scheme, one per value type.		/// Helper to add value to coordinate scheme, one per value type.
IMPL3(addEltF64, double)		IMPL3(addEltF64, double)
IMPL3(addEltF32, float)		IMPL3(addEltF32, float)
IMPL3(addEltI64, int64_t)		IMPL3(addEltI64, int64_t)
IMPL3(addEltI32, int32_t)		IMPL3(addEltI32, int32_t)
IMPL3(addEltI16, int16_t)		IMPL3(addEltI16, int16_t)
IMPL3(addEltI8, int8_t)		IMPL3(addEltI8, int8_t)

		IMPL_COO_GETNEXT(getNextF64, double)
		IMPL_COO_GETNEXT(getNextF32, float)
		IMPL_COO_GETNEXT(getNextI64, int64_t)
		IMPL_COO_GETNEXT(getNextI32, int32_t)
		IMPL_COO_GETNEXT(getNextI16, int16_t)
		IMPL_COO_GETNEXT(getNextI8, int8_t)

#undef CASE		#undef CASE
#undef IMPL1		#undef IMPL1
		aartbikUnsubmitted Done Reply Inline Actions feel free to rename this into more intuitive names in a follow up revision too, btw aartbik: feel free to rename this into more intuitive names in a follow up revision too, btw
		wrengrAuthorUnsubmitted Done Reply Inline Actions Will do wrengr: Will do
#undef IMPL2		#undef IMPL2
#undef IMPL3		#undef IMPL3
		#undef IMPL_COO_GETNEXT

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// Public API with methods that accept C-style data structures to interact		// Public API with methods that accept C-style data structures to interact
// with sparse tensors, which are only visible as opaque pointers externally.		// with sparse tensors, which are only visible as opaque pointers externally.
// These methods can be used both by MLIR compiler-generated code as well as by		// These methods can be used both by MLIR compiler-generated code as well as by
// an external runtime that wants to interact with MLIR compiler-generated code.		// an external runtime that wants to interact with MLIR compiler-generated code.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// Returns size of sparse tensor in given dimension.		/// Returns size of sparse tensor in given dimension.
		aartbikUnsubmitted Done Reply Inline Actions I prefer not to have this function at all; but if we keep it, it should go to CRunnerUtils.cpp since we have some other printing utilities there but this is not related to sparse aartbik: I prefer not to have this function at all; but if we keep it, it should go to CRunnerUtils.cpp…
		wrengrAuthorUnsubmitted Done Reply Inline Actions Yeah, I'd rather not have this function too. Unfortunately, when I tried using the CRunnerUtils.cpp functions that vector.print does, I could never get it to link right: it either complained about re-defining a symbol, or about using an un-defined symbol. As for implementing it in the integration test itself, I can't seem to find a way to define string values (not attributes) for passing to fputs() Moved to CRunnerUtils.cpp for now. Will try to rip it out in a future differential. wrengr: Yeah, I'd rather not have this function too. Unfortunately, when I tried using the CRunnerUtils.
		aartbikUnsubmitted Done Reply Inline Actions How about just leaving it at the exit(1) for not and think about this in a future differential. Rather than introducing something we want to rip again anyway? aartbik: How about just leaving it at the exit(1) for not and think about this in a future differential.
uint64_t sparseDimSize(void *tensor, uint64_t d) {		uint64_t sparseDimSize(void *tensor, uint64_t d) {
return static_cast<SparseTensorStorageBase *>(tensor)->getDimSize(d);		return static_cast<SparseTensorStorageBase *>(tensor)->getDimSize(d);
}		}

/// Releases sparse tensor storage.		/// Releases sparse tensor storage.
void delSparseTensor(void *tensor) {		void delSparseTensor(void *tensor) {
delete static_cast<SparseTensorStorageBase *>(tensor);		delete static_cast<SparseTensorStorageBase *>(tensor);
}		}
▲ Show 20 Lines • Show All 46 Lines • Show Last 20 Lines

This is an archive of the discontinued LLVM Phabricator instance.

[mlir][sparse] Implementing sparse=>dense conversion.ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 376318

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp

mlir/lib/ExecutionEngine/SparseUtils.cpp

[mlir][sparse] Implementing sparse=>dense conversion.
ClosedPublic