This is an archive of the discontinued LLVM Phabricator instance.

mlir/include/mlir/ExecutionEngine/SparseTensor/Storage.h
865	you want to be defensive here too with an else assert dense (for the future)
917	period at end
mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
453	nice refinement!

This revision is now accepted and ready to land. Oct 5 2022, 1:20 PM

rebase

Harbormaster completed remote builds in B190584: Diff 465535. Oct 5 2022, 1:33 PM

Adding future-proofing, per nits

Harbormaster completed remote builds in B190588: Diff 465539. Oct 5 2022, 1:38 PM

wrengr removed a parent revision: D134926: [mlir][sparse] Factoring out predicates on DimLevelTypes. Oct 5 2022, 1:58 PM

Fixing typo in the new assertions

wrengr removed a parent revision: D134926: [mlir][sparse] Factoring out predicates on DimLevelTypes. Oct 5 2022, 3:13 PM

Harbormaster completed remote builds in B190616: Diff 465572. Oct 5 2022, 3:14 PM

Closed by commit rG1b27484a49ac: [mlir][sparse] further implement singleton dimension level type (authored by wrengr). Oct 5 2022, 4:15 PM

This revision was automatically updated to reflect the committed changes.

wrengr added a commit: rG1b27484a49ac: [mlir][sparse] further implement singleton dimension level type.

wrengr mentioned this in rG933fefb6a834: [mlir][sparse] Adjusting DimLevelType numeric values for faster predicates. Oct 5 2022, 5:40 PM

This change broke the mlir build on Windows in Debug. I'm actually not entirely sure why it only broke in Debug and not in Release as well because I would expect the issue to be the same in both.

What's happening now is that SparseTensorConversion.cpp has calls to a couple of functions defined in Enums.h. However, Enums.h is set up as if it belongs to a shared library (but it doesn't; see the comment in the CMakeLists file from when it was created here: https://reviews.llvm.org/D133462). So when it is included in SparseTensorConversion.cpp, the functions are marked as dllimport but there's no library that exports them, and the build fails to link a number of executables. For example:

cmd.exe /C "cd . && "C:\Program Files\CMake\bin\cmake.exe" -E vs_link_exe --intdir=tools\mlir\tools\mlir-opt\CMakeFiles\mlir-opt.dir --rc=C:\PROGRA~2\WI3CF2~1\10\bin\100203~1.0\x64\rc.exe --mt=C:\PROGRA~2\WI3CF2~1\10\bin\100203~1.0\x64\mt.exe --manifests  -- C:\PROGRA~1\MIB055~1\2022\PROFES~1\VC\Tools\MSVC\1433~1.316\bin\Hostx64\x64\link.exe /nologo @CMakeFiles\mlir-opt.rsp  /out:bin\mlir-opt.exe /implib:lib\mlir-opt.lib /pdb:bin\mlir-opt.pdb /version:0.0 /machine:x64 /STACK:10000000 /debug /INCREMENTAL /subsystem:console  && cd ."
LINK Pass 1: command "C:\PROGRA~1\MIB055~1\2022\PROFES~1\VC\Tools\MSVC\1433~1.316\bin\Hostx64\x64\link.exe /nologo @CMakeFiles\mlir-opt.rsp /out:bin\mlir-opt.exe /implib:lib\mlir-opt.lib /pdb:bin\mlir-opt.pdb /version:0.0 /machine:x64 /STACK:10000000 /debug /INCREMENTAL /subsystem:console /MANIFEST /MANIFESTFILE:tools\mlir\tools\mlir-opt\CMakeFiles\mlir-opt.dir/intermediate.manifest tools\mlir\tools\mlir-opt\CMakeFiles\mlir-opt.dir/manifest.res" failed (exit code 1120) with the following output:
MLIRSparseTensorTransforms.lib(SparseTensorConversion.cpp.obj) : error LNK2019: unresolved external symbol "bool __cdecl mlir::sparse_tensor::isDenseDLT(enum mlir::sparse_tensor::DimLevelType)" (?isDenseDLT@sparse_tensor@mlir@@YA_NW4DimLevelType@12@@Z) referenced in function "bool __cdecl `anonymous namespace'::canUseDirectConversion(class llvm::ArrayRef<enum mlir::sparse_tensor::SparseTensorEncodingAttr::DimLevelType>)" (?canUseDirectConversion@?A0x0125d971@@YA_NV?$ArrayRef@W4DimLevelType@SparseTensorEncodingAttr@sparse_tensor@mlir@@@llvm@@@Z)
MLIRSparseTensorTransforms.lib(SparseTensorConversion.cpp.obj) : error LNK2019: unresolved external symbol "bool __cdecl mlir::sparse_tensor::isCompressedDLT(enum mlir::sparse_tensor::DimLevelType)" (?isCompressedDLT@sparse_tensor@mlir@@YA_NW4DimLevelType@12@@Z) referenced in function "bool __cdecl `anonymous namespace'::canUseDirectConversion(class llvm::ArrayRef<enum mlir::sparse_tensor::SparseTensorEncodingAttr::DimLevelType>)" (?canUseDirectConversion@?A0x0125d971@@YA_NV?$ArrayRef@W4DimLevelType@SparseTensorEncodingAttr@sparse_tensor@mlir@@@llvm@@@Z)
MLIRSparseTensorTransforms.lib(SparseTensorConversion.cpp.obj) : error LNK2019: unresolved external symbol "bool __cdecl mlir::sparse_tensor::isSingletonDLT(enum mlir::sparse_tensor::DimLevelType)" (?isSingletonDLT@sparse_tensor@mlir@@YA_NW4DimLevelType@12@@Z) referenced in function "bool __cdecl `anonymous namespace'::canUseDirectConversion(class llvm::ArrayRef<enum mlir::sparse_tensor::SparseTensorEncodingAttr::DimLevelType>)" (?canUseDirectConversion@?A0x0125d971@@YA_NV?$ArrayRef@W4DimLevelType@SparseTensorEncodingAttr@sparse_tensor@mlir@@@llvm@@@Z)
bin\mlir-opt.exe : fatal error LNK1120: 3 unresolved externals
ninja: build stopped: subcommand failed.
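
To make the failure mode concrete, here is a minimal sketch of the usual Windows export-macro pattern. It is only an illustration under stated assumptions: the `SKETCH_SHARED`, `SKETCH_BUILDING_LIB`, and `SKETCH_API` macros, the `sketch` namespace, and the `Level` enum are all hypothetical and are not the actual contents of Enums.h.

```cpp
#include <cstdint>

// Hypothetical macros for illustration only (not the real Enums.h).
// SKETCH_SHARED:       the header is meant to be consumed from a DLL.
// SKETCH_BUILDING_LIB: defined only while compiling that DLL itself.
#if defined(_WIN32) && defined(SKETCH_SHARED)
#  if defined(SKETCH_BUILDING_LIB)
#    define SKETCH_API __declspec(dllexport)
#  else
#    define SKETCH_API __declspec(dllimport)
#  endif
#else
#  define SKETCH_API // static build: no import/export annotation
#endif

namespace sketch {

enum class Level : uint8_t { Dense, Compressed, Singleton };

// What an includer sees. With SKETCH_SHARED defined on Windows this becomes
// a dllimport declaration, so the linker expects the symbol to be provided
// by some library in the link. If nothing provides it, the link fails with
// an unresolved external (LNK2019).
SKETCH_API bool isDense(Level level);

// Definition. In this sketch SKETCH_SHARED is left undefined, so SKETCH_API
// expands to nothing and this links like any ordinary function.
bool isDense(Level level) { return level == Level::Dense; }

} // namespace sketch
```

Leaving `SKETCH_SHARED` undefined corresponds to treating the library as static: the attribute disappears and the includer no longer expects an exporting DLL.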

This can be fixed in a couple of ways:

1. Assume that mlir_sparse_tensor_utils is a static library (which it is) and then remove any export/import definitions. This should be fairly straightforward. In this case we might want to consider the library naming as well since mlir_xxx_utils are generally shared libraries.
2. Make mlir_sparse_tensor_utils a shared library and make all the necessary fixes for it to work. One caveat here is that we'd end up with a situation where one of the transform libraries (and as far as I can tell only one of the transform libraries) requires linking to a shared library.

I suppose a third solution would be to move the definitions from Enums.h to the existing MLIRSparseTensorUtils which is also a static library.

Thoughts?

In D134933#3843372, @stella.stamenova wrote:

What's happening now is that SparseTensorConversion.cpp has calls to a couple of functions defined in Enums.h. However, Enums.h is set up as if it belongs to a shared library (but it doesn't; see the comment in the CMakeLists file from when it was created here: https://reviews.llvm.org/D133462). So when it is included in SparseTensorConversion.cpp, the functions are marked as dllimport but there's no library that exports them, and the build fails to link a number of executables.

When factoring out the mlir_sparse_tensor_utils library I was having a lot of issues getting things to work under MSVC, and ultimately I could only resolve the issues by making mlir_sparse_tensor_utils static rather than shared. Fwiw, the chain of reasoning leading to the current state is/was: mlir_c_runner_utils is a shared library so the stuff in ExecutionEngine/SparseTensorUtils.h must be marked with dllimport/dllexport, but SparseTensorUtils.h includes ExecutionEngine/SparseTensor/Enums.h and needs to reexport the enum definitions in order for the SparseTensorUtils.h functions to be usable, so the stuff in Enums.h ends up needing to be dllimport/dllexport as well. (And since Enums.h includes ExecutionEngine/Float16bits.h the same thing happens again, though the mlir_float16_utils library can be made a shared lib without complications.)

As I recall, the only thing mlir_sparse_tensor_utils really needs to export as part of its own API are the enum definitions themselves. So I think the quickest fix might be to just remove the MLIR_SPARSETENSOR_EXPORT attribute from all the functions in Enums.h and only leave it on the enums themselves. That does still leave things in an awkward state, but if it makes things green then that'll buy time to come up with a better fix. Though I'm not sure if leaving the MLIR_SPARSETENSOR_EXPORT attribute on the enum definitions would still cause the sorts of problems you're seeing. This is my first time working with DLLs on Windows, so any enlightenment you can share re best practices would be most welcome.
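
A rough sketch of what that quick fix could look like: the enum and its predicates stay side by side in one header, and the predicates carry no import/export attribute at all. The predicate names come from the linker errors above; the `sketch` namespace and the enumerator names and numeric values are assumptions made for illustration and do not reproduce the real encoding.

```cpp
#include <cstdint>

namespace sketch {

// Hypothetical enumerators and values; the real DimLevelType differs.
enum class DimLevelType : uint8_t {
  kDense = 0,
  kCompressed = 1,
  kSingleton = 2,
};

// No dllimport/dllexport here: every translation unit that includes the
// header compiles its own inline copy, so no library has to export them.
constexpr bool isDenseDLT(DimLevelType dlt) {
  return dlt == DimLevelType::kDense;
}
constexpr bool isCompressedDLT(DimLevelType dlt) {
  return dlt == DimLevelType::kCompressed;
}
constexpr bool isSingletonDLT(DimLevelType dlt) {
  return dlt == DimLevelType::kSingleton;
}

// With the predicates next to the enum, compile-time checks can flag any
// drift when the enum values are changed later.
static_assert(isDenseDLT(DimLevelType::kDense), "dense predicate drifted");
static_assert(isCompressedDLT(DimLevelType::kCompressed),
              "compressed predicate drifted");
static_assert(isSingletonDLT(DimLevelType::kSingleton),
              "singleton predicate drifted");
static_assert(!isDenseDLT(DimLevelType::kSingleton),
              "predicates must be mutually exclusive");

} // namespace sketch
```

An enum definition by itself is only a type and produces no linker symbol, which is one reason a header shaped like this can be shared between a runtime library and a transform library without either side importing from the other.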

Also, it seems like the LLVM Windows buildbot is a lot stricter than the MLIR/Phabricator Windows buildbot. Is there any way I can send differentials to be run by the LLVM buildbot before landing them?

This can be fixed in a couple of ways:

1. Assume that mlir_sparse_tensor_utils is a static library (which it is) and then remove any export/import definitions. This should be fairly straightforward. In this case we might want to consider the library naming as well since mlir_xxx_utils are generally shared libraries.

2. Make mlir_sparse_tensor_utils a shared library and make all the necessary fixes for it to work. One caveat here is that we'd end up with a situation where one of the transform libraries (and as far as I can tell only one of the transform libraries) requires linking to a shared library.

I suppose a third solution would be to move the definitions from Enums.h to the existing MLIRSparseTensorUtils which is also a static library.

Thoughts?

Just as a bit of background: The reason we split mlir_sparse_tensor_utils out from mlir_c_runner_utils is because another team at Google wanted to be able to use the C++ library directly rather than going through the C-API exposed by mlir_c_runner_utils. That team doesn't care whether the library is shared vs static, and doesn't care about the Windows side either, so whatever solution we can come up with should be fine for their needs. Originally I intended to make it a shared library, following the pattern of all the other mlir_xxx_utils libraries under the ExecutionEngine directory, but kept running into MSVC linker errors. Eventually I learned about dllimport/dllexport, which solved most of the problems, but ultimately I had to make the library static because std::vector isn't set up to be used/exposed by DLLs.

TBH, I'm not really sure whether mlir_sparse_tensor_utils "should" be static or dynamic (nor what MLIR's stance is on such things). We definitely want to be able to share the enums between both mlir_c_runner_utils and SparseTensorTransforms: the transforms need to be able to codegen calls into the runtime library, so both the transforms and the runtime need to agree on the enums. And I'd really like to keep the predicate functions side-by-side with the enums, since that helps ensure they remain in sync as we evolve and change the enum definitions (e.g., as in D135004). That said, the predicate functions only need to be called by SparseTensorTransforms and mlir_sparse_tensor_utils itself; they don't need to be exported from mlir_c_runner_utils. I suppose we could split Enums.h off into a separate library from the rest of mlir_sparse_tensor_utils, but that doesn't help the latter become a sharedlib since there's still the std::vector issue.

Moving things into MLIRSparseTensorUtils is a no-go, since that's a completely unrelated library. (This is an example of why I hate using "utils" rather than more expressive names :) Though I'm totally cool with renaming mlir_sparse_tensor_utils to avoid the name causing any confusion about whether it's a sharedlib or not.

wrengr mentioned this in D133462: [mlir][sparse] refactoring SparseTensorUtils: (1 of 4) file-splitting. Oct 7 2022, 4:50 PM

wrengr mentioned this in D135502: [mlir][sparse] Removing DLL attributes from ExecutionEngine/SparseTensor/Enums.h. Oct 7 2022, 5:00 PM

I just uploaded D135502 to see if that fixes things. (Apparently enum class definitions don't need the dllexport/dllimport stuff?) So let's continue the conversation on that differential.

wrengr mentioned this in rG1aa06aeb1a00: [mlir][sparse] Removing DLL attributes from ExecutionEngine/SparseTensor/Enums.h. Oct 10 2022, 11:22 AM

aartbik mentioned this in D135623: [mlir][sparse] fixed memory leak on sparse tensors. Oct 10 2022, 3:10 PM

aartbik mentioned this in rG8fc63d14c0e0: [mlir][sparse] fixed memory leak on sparse tensors. Oct 10 2022, 3:17 PM

Revision Contents

Path	Size
mlir/include/mlir/ExecutionEngine/SparseTensor/Storage.h	34 lines
mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp	13 lines
mlir/lib/ExecutionEngine/SparseTensor/NNZ.cpp	35 lines
mlir/test/Dialect/SparseTensor/conversion_sparse2sparse.mlir	42 lines
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2sparse.mlir	95 lines
mlir/test/Integration/Dialect/SparseTensor/python/test_stress.py	5 lines

Diff 464126

mlir/include/mlir/ExecutionEngine/SparseTensor/Storage.h

Show First 20 Lines • Show All 47 Lines • ▼ Show 20 Lines
class SparseTensorEnumeratorBase;		class SparseTensorEnumeratorBase;

// These macros ensure consistent error messages, without risk of incuring		// These macros ensure consistent error messages, without risk of incuring
// an additional method call to do so.		// an additional method call to do so.
#define ASSERT_VALID_DIM(d) \		#define ASSERT_VALID_DIM(d) \
assert(d < getRank() && "Dimension index is out of bounds");		assert(d < getRank() && "Dimension index is out of bounds");
#define ASSERT_COMPRESSED_DIM(d) \		#define ASSERT_COMPRESSED_DIM(d) \
assert(isCompressedDim(d) && "Dimension is not compressed");		assert(isCompressedDim(d) && "Dimension is not compressed");
		#define ASSERT_COMPRESSED_OR_SINGLETON_DIM(d) \
		assert((isCompressedDim(d) || isSingletonDim(d)) && \
		"Dimension is neither compressed nor singleton");
#define ASSERT_DENSE_DIM(d) assert(isDenseDim(d) && "Dimension is not dense");		#define ASSERT_DENSE_DIM(d) assert(isDenseDim(d) && "Dimension is not dense");

/// Abstract base class for `SparseTensorStorage<P,I,V>`. This class		/// Abstract base class for `SparseTensorStorage<P,I,V>`. This class
/// takes responsibility for all the `<P,I,V>`-independent aspects		/// takes responsibility for all the `<P,I,V>`-independent aspects
/// of the tensor (e.g., shape, sparsity, permutation). In addition,		/// of the tensor (e.g., shape, sparsity, permutation). In addition,
/// we use function overloading to implement "partial" method		/// we use function overloading to implement "partial" method
/// specialization, which the C-API relies on to catch type errors		/// specialization, which the C-API relies on to catch type errors
/// arising from our use of opaque pointers.		/// arising from our use of opaque pointers.
▲ Show 20 Lines • Show All 76 Lines • ▼ Show 20 Lines	#define DECL_GETPOINTERS(PNAME, P) \
FOREVERY_FIXED_O(DECL_GETPOINTERS)		FOREVERY_FIXED_O(DECL_GETPOINTERS)
#undef DECL_GETPOINTERS		#undef DECL_GETPOINTERS

/// Indices-overhead storage.		/// Indices-overhead storage.
#define DECL_GETINDICES(INAME, I) \		#define DECL_GETINDICES(INAME, I) \
virtual void getIndices(std::vector<I> **, uint64_t);		virtual void getIndices(std::vector<I> **, uint64_t);
FOREVERY_FIXED_O(DECL_GETINDICES)		FOREVERY_FIXED_O(DECL_GETINDICES)
#undef DECL_GETINDICES		#undef DECL_GETINDICES
		virtual uint64_t getIndex(uint64_t d, uint64_t pos) const = 0;

/// Primary storage.		/// Primary storage.
#define DECL_GETVALUES(VNAME, V) virtual void getValues(std::vector<V> **);		#define DECL_GETVALUES(VNAME, V) virtual void getValues(std::vector<V> **);
FOREVERY_V(DECL_GETVALUES)		FOREVERY_V(DECL_GETVALUES)
#undef DECL_GETVALUES		#undef DECL_GETVALUES

/// Element-wise insertion in lexicographic index order.		/// Element-wise insertion in lexicographic index order.
#define DECL_LEXINSERT(VNAME, V) virtual void lexInsert(const uint64_t *, V);		#define DECL_LEXINSERT(VNAME, V) virtual void lexInsert(const uint64_t *, V);
▲ Show 20 Lines • Show All 93 Lines • ▼ Show 20 Lines	void getPointers(std::vector<P> **out, uint64_t d) final {
*out = &pointers[d];		*out = &pointers[d];
}		}
void getIndices(std::vector<I> **out, uint64_t d) final {		void getIndices(std::vector<I> **out, uint64_t d) final {
ASSERT_VALID_DIM(d);		ASSERT_VALID_DIM(d);
*out = &indices[d];		*out = &indices[d];
}		}
void getValues(std::vector<V> **out) final { *out = &values; }		void getValues(std::vector<V> **out) final { *out = &values; }

		uint64_t getIndex(uint64_t d, uint64_t pos) const final {
		ASSERT_COMPRESSED_OR_SINGLETON_DIM(d);
		assert(pos < indices[d].size() && "Index position is out of bounds");
		return indices[d][pos]; // Converts the stored `I` into `uint64_t`.
		}

/// Partially specialize lexicographical insertions based on template types.		/// Partially specialize lexicographical insertions based on template types.
void lexInsert(const uint64_t *cursor, V val) final {		void lexInsert(const uint64_t *cursor, V val) final {
// First, wrap up pending insertion path.		// First, wrap up pending insertion path.
uint64_t diff = 0;		uint64_t diff = 0;
uint64_t top = 0;		uint64_t top = 0;
if (!values.empty()) {		if (!values.empty()) {
diff = lexDiff(cursor);		diff = lexDiff(cursor);
endPath(diff + 1);		endPath(diff + 1);
▲ Show 20 Lines • Show All 106 Lines • ▼ Show 20 Lines	void appendIndex(uint64_t d, uint64_t full, uint64_t i) {
}		}
}		}

/// Writes the given coordinate to `indices[d][pos]`. This method		/// Writes the given coordinate to `indices[d][pos]`. This method
/// checks that `i` is representable in the `I` type; however, it		/// checks that `i` is representable in the `I` type; however, it
/// does not check that `i` is semantically valid (i.e., in bounds		/// does not check that `i` is semantically valid (i.e., in bounds
/// for `dimSizes[d]` and not elsewhere occurring in the same segment).		/// for `dimSizes[d]` and not elsewhere occurring in the same segment).
void writeIndex(uint64_t d, uint64_t pos, uint64_t i) {		void writeIndex(uint64_t d, uint64_t pos, uint64_t i) {
ASSERT_COMPRESSED_DIM(d);		ASSERT_COMPRESSED_OR_SINGLETON_DIM(d);
// Subscript assignment to `std::vector` requires that the `pos`-th		// Subscript assignment to `std::vector` requires that the `pos`-th
// entry has been initialized; thus we must be sure to check `size()`		// entry has been initialized; thus we must be sure to check `size()`
// here, instead of `capacity()` as would be ideal.		// here, instead of `capacity()` as would be ideal.
assert(pos < indices[d].size() && "Index position is out of bounds");		assert(pos < indices[d].size() && "Index position is out of bounds");
assert(i <= std::numeric_limits<I>::max() &&		assert(i <= std::numeric_limits<I>::max() &&
"Index value is too large for the I-type");		"Index value is too large for the I-type");
indices[d][pos] = static_cast<I>(i);		indices[d][pos] = static_cast<I>(i);
}		}

/// Computes the assembled-size associated with the `d`-th dimension,		/// Computes the assembled-size associated with the `d`-th dimension,
/// given the assembled-size associated with the `(d-1)`-th dimension.		/// given the assembled-size associated with the `(d-1)`-th dimension.
/// "Assembled-sizes" correspond to the (nominal) sizes of overhead		/// "Assembled-sizes" correspond to the (nominal) sizes of overhead
/// storage, as opposed to "dimension-sizes" which are the cardinality		/// storage, as opposed to "dimension-sizes" which are the cardinality
/// of coordinates for that dimension.		/// of coordinates for that dimension.
///		///
/// Precondition: the `pointers[d]` array must be fully initialized		/// Precondition: the `pointers[d]` array must be fully initialized
/// before calling this method.		/// before calling this method.
uint64_t assembledSize(uint64_t parentSz, uint64_t d) const {		uint64_t assembledSize(uint64_t parentSz, uint64_t d) const {
if (isCompressedDim(d))		if (isCompressedDim(d))
return pointers[d][parentSz];		return pointers[d][parentSz];
// else if dense:		if (isSingletonDim(d))
		return parentSz; // New size is same as the parent.
		if (isDenseDim(d))
return parentSz * getDimSizes()[d];		return parentSz * getDimSizes()[d];
		MLIR_SPARSETENSOR_FATAL("unsupported dimension level type");
}		}

/// Initializes sparse tensor storage scheme from a memory-resident sparse		/// Initializes sparse tensor storage scheme from a memory-resident sparse
/// tensor in coordinate scheme. This method prepares the pointers and		/// tensor in coordinate scheme. This method prepares the pointers and
/// indices arrays under the given per-dimension dense/sparse annotations.		/// indices arrays under the given per-dimension dense/sparse annotations.
///		///
/// Preconditions:		/// Preconditions:
/// (1) the `elements` must be lexicographically sorted.		/// (1) the `elements` must be lexicographically sorted.
Show All 31 Lines	private:

/// Finalize the sparse pointer structure at this dimension.		/// Finalize the sparse pointer structure at this dimension.
void finalizeSegment(uint64_t d, uint64_t full = 0, uint64_t count = 1) {		void finalizeSegment(uint64_t d, uint64_t full = 0, uint64_t count = 1) {
if (count == 0)		if (count == 0)
return; // Short-circuit, since it'll be a nop.		return; // Short-circuit, since it'll be a nop.
if (isCompressedDim(d)) {		if (isCompressedDim(d)) {
appendPointer(d, indices[d].size(), count);		appendPointer(d, indices[d].size(), count);
} else if (isSingletonDim(d)) {		} else if (isSingletonDim(d)) {
return;		return; // Nothing to finalize.
} else { // Dense dimension.		} else { // Dense dimension.
ASSERT_DENSE_DIM(d);		ASSERT_DENSE_DIM(d);
const uint64_t sz = getDimSizes()[d];		const uint64_t sz = getDimSizes()[d];
assert(sz >= full && "Segment is overfull");		assert(sz >= full && "Segment is overfull");
count = detail::checkedMul(count, sz - full);		count = detail::checkedMul(count, sz - full);
// For dense storage we must enumerate all the remaining coordinates		// For dense storage we must enumerate all the remaining coordinates
// in this dimension (i.e., coordinates after the last non-zero		// in this dimension (i.e., coordinates after the last non-zero
// element), and either fill in their zero values or else recurse		// element), and either fill in their zero values or else recurse
Show All 12 Lines	void endPath(uint64_t diff) {
for (uint64_t i = 0; i < rank - diff; ++i) {		for (uint64_t i = 0; i < rank - diff; ++i) {
const uint64_t d = rank - i - 1;		const uint64_t d = rank - i - 1;
finalizeSegment(d, idx[d] + 1);		finalizeSegment(d, idx[d] + 1);
}		}
}		}

/// Continues a single insertion path, outer to inner.		/// Continues a single insertion path, outer to inner.
void insPath(const uint64_t *cursor, uint64_t diff, uint64_t top, V val) {		void insPath(const uint64_t *cursor, uint64_t diff, uint64_t top, V val) {
ASSERT_VALID_DIM(diff);
const uint64_t rank = getRank();		const uint64_t rank = getRank();
		assert(diff <= rank && "Dimension-diff is out of bounds");
for (uint64_t d = diff; d < rank; ++d) {		for (uint64_t d = diff; d < rank; ++d) {
const uint64_t i = cursor[d];		const uint64_t i = cursor[d];
appendIndex(d, top, i);		appendIndex(d, top, i);
top = 0;		top = 0;
idx[d] = i;		idx[d] = i;
}		}
values.push_back(val);		values.push_back(val);
}		}
Show All 16 Lines	private:
friend class SparseTensorEnumerator<P, I, V>;		friend class SparseTensorEnumerator<P, I, V>;

std::vector<std::vector<P>> pointers;		std::vector<std::vector<P>> pointers;
std::vector<std::vector<I>> indices;		std::vector<std::vector<I>> indices;
std::vector<V> values;		std::vector<V> values;
std::vector<uint64_t> idx; // index cursor for lexicographic insertion.		std::vector<uint64_t> idx; // index cursor for lexicographic insertion.
};		};

		#undef ASSERT_COMPRESSED_OR_SINGLETON_DIM
#undef ASSERT_COMPRESSED_DIM		#undef ASSERT_COMPRESSED_DIM
#undef ASSERT_VALID_DIM		#undef ASSERT_VALID_DIM

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
/// A (higher-order) function object for enumerating the elements of some		/// A (higher-order) function object for enumerating the elements of some
/// `SparseTensorStorage` under a permutation. That is, the `forallElements`		/// `SparseTensorStorage` under a permutation. That is, the `forallElements`
/// method encapsulates the loop-nest for enumerating the elements of		/// method encapsulates the loop-nest for enumerating the elements of
/// the source tensor (in whatever order is best for the source tensor),		/// the source tensor (in whatever order is best for the source tensor),
▲ Show 20 Lines • Show All 112 Lines • ▼ Show 20 Lines	if (d == Base::getRank()) {
const std::vector<I> &indicesD = src.indices[d];		const std::vector<I> &indicesD = src.indices[d];
assert(pstop <= indicesD.size() && "Index position is out of bounds");		assert(pstop <= indicesD.size() && "Index position is out of bounds");
uint64_t &cursorReordD = this->cursor[this->reord[d]];		uint64_t &cursorReordD = this->cursor[this->reord[d]];
for (uint64_t pos = pstart; pos < pstop; ++pos) {		for (uint64_t pos = pstart; pos < pstop; ++pos) {
cursorReordD = static_cast<uint64_t>(indicesD[pos]);		cursorReordD = static_cast<uint64_t>(indicesD[pos]);
forallElements(yield, pos, d + 1);		forallElements(yield, pos, d + 1);
}		}
} else if (src.isSingletonDim(d)) {		} else if (src.isSingletonDim(d)) {
MLIR_SPARSETENSOR_FATAL("unsupported dimension level type");		this->cursor[this->reord[d]] = src.getIndex(d, parentPos);
		forallElements(yield, parentPos, d + 1);
} else { // Dense dimension.		} else { // Dense dimension.
assert(src.isDenseDim(d)); // TODO: reuse the ASSERT_DENSE_DIM message		assert(src.isDenseDim(d)); // TODO: reuse the ASSERT_DENSE_DIM message
const uint64_t sz = src.getDimSizes()[d];		const uint64_t sz = src.getDimSizes()[d];
const uint64_t pstart = parentPos * sz;		const uint64_t pstart = parentPos * sz;
uint64_t &cursorReordD = this->cursor[this->reord[d]];		uint64_t &cursorReordD = this->cursor[this->reord[d]];
for (uint64_t i = 0; i < sz; ++i) {		for (uint64_t i = 0; i < sz; ++i) {
cursorReordD = i;		cursorReordD = i;
forallElements(yield, pstart + i, d + 1);		forallElements(yield, pstart + i, d + 1);
▲ Show 20 Lines • Show All 86 Lines • ▼ Show 20 Lines	SparseTensorStorage<P, I, V> *SparseTensorStorage<P, I, V>::newSparseTensor(
const DimLevelType sparsity, SparseTensorCOO<V> coo) {		const DimLevelType sparsity, SparseTensorCOO<V> coo) {
if (coo) {		if (coo) {
const auto &coosz = coo->getDimSizes();		const auto &coosz = coo->getDimSizes();
#ifndef NDEBUG		#ifndef NDEBUG
detail::assertPermutedSizesMatchShape(coosz, rank, perm, shape);		detail::assertPermutedSizesMatchShape(coosz, rank, perm, shape);
#endif		#endif
return new SparseTensorStorage<P, I, V>(coosz, perm, sparsity, coo);		return new SparseTensorStorage<P, I, V>(coosz, perm, sparsity, coo);
}		}
// else
std::vector<uint64_t> permsz(rank);		std::vector<uint64_t> permsz(rank);
for (uint64_t r = 0; r < rank; ++r) {		for (uint64_t r = 0; r < rank; ++r) {
assert(shape[r] > 0 && "Dimension size zero has trivial storage");		assert(shape[r] > 0 && "Dimension size zero has trivial storage");
permsz[perm[r]] = shape[r];		permsz[perm[r]] = shape[r];
}		}
// We pass the null `coo` to ensure we select the intended constructor.		// We pass the null `coo` to ensure we select the intended constructor.
return new SparseTensorStorage<P, I, V>(permsz, perm, sparsity, coo);		return new SparseTensorStorage<P, I, V>(permsz, perm, sparsity, coo);
}		}
▲ Show 20 Lines • Show All 91 Lines • ▼ Show 20 Lines	for (uint64_t rank = getRank(), r = 0; r < rank; ++r) {
}		}
// Update assembled-size for the next iteration.		// Update assembled-size for the next iteration.
parentSz = assembledSize(parentSz, r);		parentSz = assembledSize(parentSz, r);
// Ideally we need only `indices[r].reserve(parentSz)`, however		// Ideally we need only `indices[r].reserve(parentSz)`, however
// the `std::vector` implementation forces us to initialize it too.		// the `std::vector` implementation forces us to initialize it too.
// That is, in the yieldPos loop we need random-access assignment		// That is, in the yieldPos loop we need random-access assignment
// to `indices[r]`; however, `std::vector`'s subscript-assignment		// to `indices[r]`; however, `std::vector`'s subscript-assignment
// only allows assigning to already-initialized positions.		// only allows assigning to already-initialized positions.
if (isCompressedDim(r))		if (isCompressedDim(r) || isSingletonDim(r))
		aartbik (inline comment): you want to be defensive here too with an else assert dense (for the future)
indices[r].resize(parentSz, 0);		indices[r].resize(parentSz, 0);
}		}
values.resize(parentSz, 0); // Both allocate and zero-initialize.		values.resize(parentSz, 0); // Both allocate and zero-initialize.
}		}
// The yieldPos loop		// The yieldPos loop
enumerator->forallElements([this](const std::vector<uint64_t> &ind, V val) {		enumerator->forallElements([this](const std::vector<uint64_t> &ind, V val) {
uint64_t parentSz = 1, parentPos = 0;		uint64_t parentSz = 1, parentPos = 0;
for (uint64_t rank = getRank(), r = 0; r < rank; ++r) {		for (uint64_t rank = getRank(), r = 0; r < rank; ++r) {
if (isCompressedDim(r)) {		if (isCompressedDim(r)) {
// If `parentPos == parentSz` then it's valid as an array-lookup;		// If `parentPos == parentSz` then it's valid as an array-lookup;
// however, it's semantically invalid here since that entry		// however, it's semantically invalid here since that entry
// does not represent a segment of `indices[r]`. Moreover, that		// does not represent a segment of `indices[r]`. Moreover, that
// entry must be immutable for `assembledSize` to remain valid.		// entry must be immutable for `assembledSize` to remain valid.
assert(parentPos < parentSz && "Pointers position is out of bounds");		assert(parentPos < parentSz && "Pointers position is out of bounds");
const uint64_t currentPos = pointers[r][parentPos];		const uint64_t currentPos = pointers[r][parentPos];
// This increment won't overflow the `P` type, since it can't		// This increment won't overflow the `P` type, since it can't
// exceed the original value of `pointers[r][parentPos+1]`		// exceed the original value of `pointers[r][parentPos+1]`
// which was already verified to be within bounds for `P`		// which was already verified to be within bounds for `P`
// when it was written to the array.		// when it was written to the array.
pointers[r][parentPos]++;		pointers[r][parentPos]++;
writeIndex(r, currentPos, ind[r]);		writeIndex(r, currentPos, ind[r]);
parentPos = currentPos;		parentPos = currentPos;
} else if (isSingletonDim(r)) {		} else if (isSingletonDim(r)) {
		writeIndex(r, parentPos, ind[r]);
// the new parentPos equals the old parentPos.		// the new parentPos equals the old parentPos.
} else { // Dense dimension.		} else { // Dense dimension.
ASSERT_DENSE_DIM(r);		ASSERT_DENSE_DIM(r);
parentPos = parentPos * getDimSizes()[r] + ind[r];		parentPos = parentPos * getDimSizes()[r] + ind[r];
}		}
parentSz = assembledSize(parentSz, r);		parentSz = assembledSize(parentSz, r);
}		}
assert(parentPos < values.size() && "Value position is out of bounds");		assert(parentPos < values.size() && "Value position is out of bounds");
Show All 11 Lines	if (isCompressedDim(r)) {
"Pointers got corrupted");		"Pointers got corrupted");
// TODO: optimize this by using `memmove` or similar.		// TODO: optimize this by using `memmove` or similar.
for (uint64_t n = 0; n < parentSz; ++n) {		for (uint64_t n = 0; n < parentSz; ++n) {
const uint64_t parentPos = parentSz - n;		const uint64_t parentPos = parentSz - n;
pointers[r][parentPos] = pointers[r][parentPos - 1];		pointers[r][parentPos] = pointers[r][parentPos - 1];
}		}
pointers[r][0] = 0;		pointers[r][0] = 0;
}		}
		// Both dense and singleton are no-ops for the finalizeYieldPos loop
		aartbik (inline comment): period at end
parentSz = assembledSize(parentSz, r);		parentSz = assembledSize(parentSz, r);
}		}
}		}

		#undef ASSERT_DENSE_DIM

} // namespace sparse_tensor		} // namespace sparse_tensor
} // namespace mlir		} // namespace mlir

#undef ASSERT_DENSE_DIM		#undef ASSERT_DENSE_DIM

#endif // MLIR_EXECUTIONENGINE_SPARSETENSOR_STORAGE_H		#endif // MLIR_EXECUTIONENGINE_SPARSETENSOR_STORAGE_H
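
As a reading aid for the `assembledSize` change in the Storage.h diff above, the three cases can be sketched as a standalone function. This is an assumption-laden simplification: the `Level` enum, the free-function signature, and passing the pointers array explicitly are hypothetical, whereas the real method is a class member that reads `pointers[d]` and `getDimSizes()[d]` itself.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

namespace sketch {

// Hypothetical stand-in for the dimension level types discussed above.
enum class Level { Dense, Compressed, Singleton };

// Given the assembled size of the parent dimension, compute the assembled
// size of this dimension.
//   Compressed: the last used entry of the pointers array, pointers[parentSz].
//   Singleton:  same as the parent (one coordinate per parent position).
//   Dense:      parent size times the dimension size.
inline uint64_t assembledSize(Level level, uint64_t parentSz, uint64_t dimSize,
                              const std::vector<uint64_t> &pointers) {
  switch (level) {
  case Level::Compressed:
    return pointers.at(parentSz); // bounds-checked lookup
  case Level::Singleton:
    return parentSz;
  case Level::Dense:
    return parentSz * dimSize;
  }
  throw std::logic_error("unsupported dimension level type");
}

} // namespace sketch
```

For example, with a compressed-then-singleton layout holding 5 stored entries and `pointers = {0, 5}`, the sizes go from 1 to 5 and then stay at 5. With a dense-then-compressed layout over 3 rows and `pointers = {0, 2, 3, 5}`, they go from 1 to 3 and then to 5.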

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp

	Show First 20 Lines • Show All 443 Lines • ▼ Show 20 Lines
	}			}

	/// Determine if the runtime library supports direct conversion to the			/// Determine if the runtime library supports direct conversion to the
	/// given target `dimTypes`.			/// given target `dimTypes`.
	static bool canUseDirectConversion(			static bool canUseDirectConversion(
	ArrayRef<SparseTensorEncodingAttr::DimLevelType> dimTypes) {			ArrayRef<SparseTensorEncodingAttr::DimLevelType> dimTypes) {
	bool alreadyCompressed = false;			bool alreadyCompressed = false;
	for (uint64_t rank = dimTypes.size(), r = 0; r < rank; r++) {			for (uint64_t rank = dimTypes.size(), r = 0; r < rank; r++) {
	switch (dimTypes[r]) {			const DimLevelType dlt = dimLevelTypeEncoding(dimTypes[r]);
	case SparseTensorEncodingAttr::DimLevelType::Compressed:			if (isCompressedDLT(dlt)) {
				aartbik: nice refinement!
	if (alreadyCompressed)			if (alreadyCompressed)
	return false; // Multiple compressed dimensions not yet supported.			return false; // Multiple compressed dimensions not yet supported.
	alreadyCompressed = true;			alreadyCompressed = true;
	break;			} else if (isDenseDLT(dlt)) {
	case SparseTensorEncodingAttr::DimLevelType::Dense:
	if (alreadyCompressed)			if (alreadyCompressed)
	return false; // Dense after Compressed not yet supported.			return false; // Dense after Compressed not yet supported.
	break;			} else if (isSingletonDLT(dlt)) {
	default: // TODO: investigate			// Direct conversion doesn't have any particular problems with
				// singleton after compressed.
				} else { // TODO: investigate
	return false;			return false;
	}			}
	}			}
	return true;			return true;
	}			}

	/// Helper method to translate indices during a reshaping operation.			/// Helper method to translate indices during a reshaping operation.
	/// TODO: provide as general utility to MLIR at large?			/// TODO: provide as general utility to MLIR at large?
	[1,045 lines not shown]
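
Note on `canUseDirectConversion` above: the new if/else chain can be read as a small state machine over the target level types. The Python sketch below restates it under the assumption that level types are spelled as the strings used in the `.mlir` encodings. The function name and the string encoding are illustrative only, not the C++ API.

```python
def can_use_direct_conversion(dim_types):
    """Return True if the level types pass the checks shown in the diff.

    `dim_types` is a sequence of strings ("dense", "compressed",
    "singleton"); that encoding is an assumption of this sketch.
    """
    already_compressed = False
    for dlt in dim_types:
        if dlt == "compressed":
            if already_compressed:
                return False  # Multiple compressed dimensions not yet supported.
            already_compressed = True
        elif dlt == "dense":
            if already_compressed:
                return False  # Dense after compressed not yet supported.
        elif dlt == "singleton":
            pass  # Singleton after compressed poses no particular problem.
        else:
            return False  # Anything else: conservatively reject.
    return True
```

Under these checks a `["dense", "compressed", "singleton"]` target qualifies, whereas `["dense", "compressed", "dense"]` does not.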

mlir/lib/ExecutionEngine/SparseTensor/NNZ.cpp

	[25 lines not shown]
	/// does not actually populate the statistics, however; for that see			/// does not actually populate the statistics, however; for that see
	/// `initialize`.			/// `initialize`.
	///			///
	/// Precondition: `dimSizes` must not contain zeros.			/// Precondition: `dimSizes` must not contain zeros.
	SparseTensorNNZ::SparseTensorNNZ(const std::vector<uint64_t> &dimSizes,			SparseTensorNNZ::SparseTensorNNZ(const std::vector<uint64_t> &dimSizes,
	const std::vector<DimLevelType> &sparsity)			const std::vector<DimLevelType> &sparsity)
	: dimSizes(dimSizes), dimTypes(sparsity), nnz(getRank()) {			: dimSizes(dimSizes), dimTypes(sparsity), nnz(getRank()) {
	assert(dimSizes.size() == dimTypes.size() && "Rank mismatch");			assert(dimSizes.size() == dimTypes.size() && "Rank mismatch");
	bool uncompressed = true;			bool alreadyCompressed = false;
	(void)uncompressed;			(void)alreadyCompressed;
	uint64_t sz = 1; // the product of all `dimSizes` strictly less than `r`.			uint64_t sz = 1; // the product of all `dimSizes` strictly less than `r`.
	for (uint64_t rank = getRank(), r = 0; r < rank; r++) {			for (uint64_t rank = getRank(), r = 0; r < rank; r++) {
	switch (dimTypes[r]) {			const DimLevelType dlt = sparsity[r];
	case DimLevelType::kCompressed:			if (isCompressedDLT(dlt)) {
	assert(uncompressed &&			if (alreadyCompressed)
				MLIR_SPARSETENSOR_FATAL(
	"Multiple compressed layers not currently supported");			"Multiple compressed layers not currently supported");
	uncompressed = false;			alreadyCompressed = true;
	nnz[r].resize(sz, 0); // Both allocate and zero-initialize.			nnz[r].resize(sz, 0); // Both allocate and zero-initialize.
	break;			} else if (isDenseDLT(dlt)) {
	case DimLevelType::kDense:			if (alreadyCompressed)
	assert(uncompressed && "Dense after compressed not currently supported");			MLIR_SPARSETENSOR_FATAL(
	break;			"Dense after compressed not currently supported");
	case DimLevelType::kSingleton:			} else if (isSingletonDLT(dlt)) {
	// Singleton after Compressed causes no problems for allocating			// Singleton after Compressed causes no problems for allocating
	// `nnz` nor for the yieldPos loop. This remains true even			// `nnz` nor for the yieldPos loop. This remains true even
	// when adding support for multiple compressed dimensions or			// when adding support for multiple compressed dimensions or
	// for dense-after-compressed.			// for dense-after-compressed.
	break;			} else {
	default:			MLIR_SPARSETENSOR_FATAL("unsupported dimension level type: %d\n",
	MLIR_SPARSETENSOR_FATAL("unsupported dimension level type");			static_cast<uint8_t>(dlt));
	}			}
	sz = detail::checkedMul(sz, dimSizes[r]);			sz = detail::checkedMul(sz, dimSizes[r]);
	}			}
	}			}

	/// Lexicographically enumerates all indicies for dimensions strictly			/// Lexicographically enumerates all indicies for dimensions strictly
	/// less than `stopDim`, and passes their nnz statistic to the callback.			/// less than `stopDim`, and passes their nnz statistic to the callback.
	/// Since our use-case only requires the statistic not the coordinates			/// Since our use-case only requires the statistic not the coordinates
	/// themselves, we do not bother to construct those coordinates.			/// themselves, we do not bother to construct those coordinates.
	void SparseTensorNNZ::forallIndices(uint64_t stopDim,			void SparseTensorNNZ::forallIndices(uint64_t stopDim,
	SparseTensorNNZ::NNZConsumer yield) const {			SparseTensorNNZ::NNZConsumer yield) const {
	assert(stopDim < getRank() && "Dimension out of bounds");			assert(stopDim < getRank() && "Dimension out of bounds");
	assert(dimTypes[stopDim] == DimLevelType::kCompressed &&			assert(isCompressedDLT(dimTypes[stopDim]) &&
	"Cannot look up non-compressed dimensions");			"Cannot look up non-compressed dimensions");
	forallIndices(yield, stopDim, 0, 0);			forallIndices(yield, stopDim, 0, 0);
	}			}

	/// Adds a new element (i.e., increment its statistics). We use			/// Adds a new element (i.e., increment its statistics). We use
	/// a method rather than inlining into the lambda in `initialize`,			/// a method rather than inlining into the lambda in `initialize`,
	/// to avoid spurious templating over `V`. And this method is private			/// to avoid spurious templating over `V`. And this method is private
	/// to avoid needing to re-assert validity of `ind` (which is guaranteed			/// to avoid needing to re-assert validity of `ind` (which is guaranteed
	/// by `forallElements`).			/// by `forallElements`).
	void SparseTensorNNZ::add(const std::vector<uint64_t> &ind) {			void SparseTensorNNZ::add(const std::vector<uint64_t> &ind) {
	uint64_t parentPos = 0;			uint64_t parentPos = 0;
	for (uint64_t rank = getRank(), r = 0; r < rank; ++r) {			for (uint64_t rank = getRank(), r = 0; r < rank; ++r) {
	if (dimTypes[r] == DimLevelType::kCompressed)			if (isCompressedDLT(dimTypes[r]))
	nnz[r][parentPos]++;			nnz[r][parentPos]++;
	parentPos = parentPos * dimSizes[r] + ind[r];			parentPos = parentPos * dimSizes[r] + ind[r];
	}			}
	}			}

	/// Recursive component of the public `forallIndices`.			/// Recursive component of the public `forallIndices`.
	void SparseTensorNNZ::forallIndices(SparseTensorNNZ::NNZConsumer yield,			void SparseTensorNNZ::forallIndices(SparseTensorNNZ::NNZConsumer yield,
	uint64_t stopDim, uint64_t parentPos,			uint64_t stopDim, uint64_t parentPos,
	[12 lines not shown]
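
Note on the NNZ.cpp hunks above: the constructor sizes `nnz[r]` for each compressed dimension by the product of the preceding dimension sizes, and `add` bumps the counter at the linearized parent position. A rough Python sketch of both, again assuming string-spelled level types; the class name `NNZSketch` is hypothetical, and Python integers stand in for the overflow-checked multiplication.

```python
class NNZSketch:
    """Illustrative stand-in for the statistics kept by SparseTensorNNZ."""

    def __init__(self, dim_sizes, dim_types):
        assert len(dim_sizes) == len(dim_types), "Rank mismatch"
        self.dim_sizes = list(dim_sizes)
        self.dim_types = list(dim_types)
        self.nnz = [[] for _ in self.dim_sizes]
        already_compressed = False
        sz = 1  # Product of all dim_sizes strictly before dimension r.
        for r, dlt in enumerate(self.dim_types):
            if dlt == "compressed":
                if already_compressed:
                    raise ValueError(
                        "Multiple compressed layers not currently supported")
                already_compressed = True
                self.nnz[r] = [0] * sz  # Allocate and zero-initialize.
            elif dlt == "dense":
                if already_compressed:
                    raise ValueError(
                        "Dense after compressed not currently supported")
            elif dlt == "singleton":
                pass  # No statistics are allocated for singleton dimensions.
            else:
                raise ValueError(f"unsupported dimension level type: {dlt}")
            sz *= self.dim_sizes[r]

    def add(self, ind):
        """Count one element with coordinates `ind` (assumed in bounds)."""
        parent_pos = 0
        for r, dlt in enumerate(self.dim_types):
            if dlt == "compressed":
                self.nnz[r][parent_pos] += 1
            parent_pos = parent_pos * self.dim_sizes[r] + ind[r]
```

Feeding the six nonzero coordinates of the 2x3x4 singleton test tensor into `NNZSketch((2, 3, 4), ("dense", "compressed", "singleton"))` leaves `nnz[1] == [3, 3]`, i.e. three stored entries under each index of the outermost dimension.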

mlir/test/Dialect/SparseTensor/conversion_sparse2sparse.mlir

	[41 lines not shown]
	// CHECK-AUTO-DAG: %[[Y:.*]] = memref.cast %[[Q]] : memref<1xindex> to memref<?xindex>			// CHECK-AUTO-DAG: %[[Y:.*]] = memref.cast %[[Q]] : memref<1xindex> to memref<?xindex>
	// CHECK-AUTO-DAG: %[[Z:.*]] = memref.cast %[[R]] : memref<1xindex> to memref<?xindex>			// CHECK-AUTO-DAG: %[[Z:.*]] = memref.cast %[[R]] : memref<1xindex> to memref<?xindex>
	// CHECK-AUTO: %[[T:.]] = call @newSparseTensor(%[[X]], %[[Y]], %[[Z]], %{{.}}, %{{.}}, %{{.}}, %[[SparseToSparse]], %[[A]])			// CHECK-AUTO: %[[T:.]] = call @newSparseTensor(%[[X]], %[[Y]], %[[Z]], %{{.}}, %{{.}}, %{{.}}, %[[SparseToSparse]], %[[A]])
	// CHECK-AUTO: return %[[T]] : !llvm.ptr<i8>			// CHECK-AUTO: return %[[T]] : !llvm.ptr<i8>
	func.func @sparse_convert(%arg0: tensor<?xf32, #SparseVector64>) -> tensor<?xf32, #SparseVector32> {			func.func @sparse_convert(%arg0: tensor<?xf32, #SparseVector64>) -> tensor<?xf32, #SparseVector32> {
	%0 = sparse_tensor.convert %arg0 : tensor<?xf32, #SparseVector64> to tensor<?xf32, #SparseVector32>			%0 = sparse_tensor.convert %arg0 : tensor<?xf32, #SparseVector64> to tensor<?xf32, #SparseVector32>
	return %0 : tensor<?xf32, #SparseVector32>			return %0 : tensor<?xf32, #SparseVector32>
	}			}

				#SparseSingleton64 = #sparse_tensor.encoding<{
				dimLevelType = ["singleton"],
				pointerBitWidth = 64,
				indexBitWidth = 64
				}>

				#SparseSingleton32 = #sparse_tensor.encoding<{
				dimLevelType = ["singleton"],
				pointerBitWidth = 32,
				indexBitWidth = 32
				}>

				// CHECK-COO-LABEL: func @sparse_convert_singleton(
				// CHECK-COO-SAME: %[[A:.*]]: !llvm.ptr<i8>)
				// CHECK-COO-DAG: %[[ToCOO:.*]] = arith.constant 5 : i32
				// CHECK-COO-DAG: %[[FromCOO:.*]] = arith.constant 2 : i32
				// CHECK-COO-DAG: %[[P:.*]] = memref.alloca() : memref<1xi8>
				// CHECK-COO-DAG: %[[Q:.*]] = memref.alloca() : memref<1xindex>
				// CHECK-COO-DAG: %[[R:.*]] = memref.alloca() : memref<1xindex>
				// CHECK-COO-DAG: %[[X:.*]] = memref.cast %[[P]] : memref<1xi8> to memref<?xi8>
				// CHECK-COO-DAG: %[[Y:.*]] = memref.cast %[[Q]] : memref<1xindex> to memref<?xindex>
				// CHECK-COO-DAG: %[[Z:.*]] = memref.cast %[[R]] : memref<1xindex> to memref<?xindex>
				// CHECK-COO: %[[C:.]] = call @newSparseTensor(%[[X]], %[[Y]], %[[Z]], %{{.}}, %{{.}}, %{{.}}, %[[ToCOO]], %[[A]])
				// CHECK-COO: %[[T:.]] = call @newSparseTensor(%[[X]], %[[Y]], %[[Z]], %{{.}}, %{{.}}, %{{.}}, %[[FromCOO]], %[[C]])
				// CHECK-COO: call @delSparseTensorCOOF32(%[[C]])
				// CHECK-COO: return %[[T]] : !llvm.ptr<i8>
				// CHECK-AUTO-LABEL: func @sparse_convert_singleton(
				// CHECK-AUTO-SAME: %[[A:.*]]: !llvm.ptr<i8>)
				// CHECK-AUTO-DAG: %[[SparseToSparse:.*]] = arith.constant 3 : i32
				// CHECK-AUTO-DAG: %[[P:.*]] = memref.alloca() : memref<1xi8>
				// CHECK-AUTO-DAG: %[[Q:.*]] = memref.alloca() : memref<1xindex>
				// CHECK-AUTO-DAG: %[[R:.*]] = memref.alloca() : memref<1xindex>
				// CHECK-AUTO-DAG: %[[X:.*]] = memref.cast %[[P]] : memref<1xi8> to memref<?xi8>
				// CHECK-AUTO-DAG: %[[Y:.*]] = memref.cast %[[Q]] : memref<1xindex> to memref<?xindex>
				// CHECK-AUTO-DAG: %[[Z:.*]] = memref.cast %[[R]] : memref<1xindex> to memref<?xindex>
				// CHECK-AUTO: %[[T:.]] = call @newSparseTensor(%[[X]], %[[Y]], %[[Z]], %{{.}}, %{{.}}, %{{.}}, %[[SparseToSparse]], %[[A]])
				// CHECK-AUTO: return %[[T]] : !llvm.ptr<i8>
				func.func @sparse_convert_singleton(%arg0: tensor<?xf32, #SparseSingleton64>) -> tensor<?xf32, #SparseSingleton32> {
				%0 = sparse_tensor.convert %arg0 : tensor<?xf32, #SparseSingleton64> to tensor<?xf32, #SparseSingleton32>
				return %0 : tensor<?xf32, #SparseSingleton32>
				}

mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2sparse.mlir

[13 lines not shown]	#Tensor2 = #sparse_tensor.encoding<{
dimLevelType = [ "dense", "compressed", "dense" ]		dimLevelType = [ "dense", "compressed", "dense" ]
}>		}>

#Tensor3 = #sparse_tensor.encoding<{		#Tensor3 = #sparse_tensor.encoding<{
dimLevelType = [ "dense", "dense", "compressed" ],		dimLevelType = [ "dense", "dense", "compressed" ],
dimOrdering = affine_map<(i,j,k) -> (i,k,j)>		dimOrdering = affine_map<(i,j,k) -> (i,k,j)>
}>		}>

		#SingletonTensor1 = #sparse_tensor.encoding<{
		dimLevelType = [ "dense", "compressed", "singleton" ]
		}>

		// This also checks the compressed->dense conversion (when there are zeros).
		#SingletonTensor2 = #sparse_tensor.encoding<{
		dimLevelType = [ "dense", "dense", "singleton" ]
		}>

		// This also checks the singleton->compressed conversion.
		#SingletonTensor3 = #sparse_tensor.encoding<{
		dimLevelType = [ "dense", "dense", "compressed" ]
		}>

module {		module {
//		//
// Utilities for output and releasing memory.		// Utilities for output and releasing memory.
//		//
func.func @dump(%arg0: tensor<2x3x4xf64>) {		func.func @dump(%arg0: tensor<2x3x4xf64>) {
%c0 = arith.constant 0 : index		%c0 = arith.constant 0 : index
%d0 = arith.constant -1.0 : f64		%d0 = arith.constant -1.0 : f64
%0 = vector.transfer_read %arg0[%c0, %c0, %c0], %d0: tensor<2x3x4xf64>, vector<2x3x4xf64>		%0 = vector.transfer_read %arg0[%c0, %c0, %c0], %d0: tensor<2x3x4xf64>, vector<2x3x4xf64>
vector.print %0 : vector<2x3x4xf64>		vector.print %0 : vector<2x3x4xf64>
return		return
}		}
func.func @dumpAndRelease_234(%arg0: tensor<2x3x4xf64>) {		func.func @dumpAndRelease_234(%arg0: tensor<2x3x4xf64>) {
call @dump(%arg0) : (tensor<2x3x4xf64>) -> ()		call @dump(%arg0) : (tensor<2x3x4xf64>) -> ()
return		return
}		}

//		//
// Main driver.		// The first test suite (for non-singleton DimLevelTypes).
//		//
func.func @entry() {		func.func @testNonSingleton() {
//		//
// Initialize a 3-dim dense tensor.		// Initialize a 3-dim dense tensor.
//		//
%src = arith.constant dense<[		%src = arith.constant dense<[
[ [ 1.0, 2.0, 3.0, 4.0 ],		[ [ 1.0, 2.0, 3.0, 4.0 ],
[ 5.0, 6.0, 7.0, 8.0 ],		[ 5.0, 6.0, 7.0, 8.0 ],
[ 9.0, 10.0, 11.0, 12.0 ] ],		[ 9.0, 10.0, 11.0, 12.0 ] ],
[ [ 13.0, 14.0, 15.0, 16.0 ],		[ [ 13.0, 14.0, 15.0, 16.0 ],
[42 lines not shown]	func.func @testNonSingleton() {
bufferization.dealloc_tensor %t23 : tensor<2x3x4xf64, #Tensor3>		bufferization.dealloc_tensor %t23 : tensor<2x3x4xf64, #Tensor3>
bufferization.dealloc_tensor %t31 : tensor<2x3x4xf64, #Tensor1>		bufferization.dealloc_tensor %t31 : tensor<2x3x4xf64, #Tensor1>
bufferization.dealloc_tensor %s1 : tensor<2x3x4xf64, #Tensor1>		bufferization.dealloc_tensor %s1 : tensor<2x3x4xf64, #Tensor1>
bufferization.dealloc_tensor %s2 : tensor<2x3x4xf64, #Tensor2>		bufferization.dealloc_tensor %s2 : tensor<2x3x4xf64, #Tensor2>
bufferization.dealloc_tensor %s3 : tensor<2x3x4xf64, #Tensor3>		bufferization.dealloc_tensor %s3 : tensor<2x3x4xf64, #Tensor3>

return		return
}		}

		//
		// The second test suite (for singleton DimLevelTypes).
		//
		func.func @testSingleton() {
		//
		// Initialize a 3-dim dense tensor with the 3rd dim being singleton.
		//
		%src = arith.constant dense<[
		[ [ 1.0, 0.0, 0.0, 0.0 ],
		[ 0.0, 6.0, 0.0, 0.0 ],
		[ 0.0, 0.0, 11.0, 0.0 ] ],
		[ [ 0.0, 14.0, 0.0, 0.0 ],
		[ 0.0, 0.0, 0.0, 20.0 ],
		[ 21.0, 0.0, 0.0, 0.0 ] ]
		]> : tensor<2x3x4xf64>

		//
		// Convert dense tensor directly to various sparse tensors.
		//
		%s1 = sparse_tensor.convert %src : tensor<2x3x4xf64> to tensor<2x3x4xf64, #SingletonTensor1>
		%s2 = sparse_tensor.convert %src : tensor<2x3x4xf64> to tensor<2x3x4xf64, #SingletonTensor2>
		%s3 = sparse_tensor.convert %src : tensor<2x3x4xf64> to tensor<2x3x4xf64, #SingletonTensor3>

		//
		// Convert sparse tensor directly to another sparse format.
		//
		%t12 = sparse_tensor.convert %s1 : tensor<2x3x4xf64, #SingletonTensor1> to tensor<2x3x4xf64, #SingletonTensor2>
		%t13 = sparse_tensor.convert %s1 : tensor<2x3x4xf64, #SingletonTensor1> to tensor<2x3x4xf64, #SingletonTensor3>
		%t21 = sparse_tensor.convert %s2 : tensor<2x3x4xf64, #SingletonTensor2> to tensor<2x3x4xf64, #SingletonTensor1>
		%t23 = sparse_tensor.convert %s2 : tensor<2x3x4xf64, #SingletonTensor2> to tensor<2x3x4xf64, #SingletonTensor3>
		%t31 = sparse_tensor.convert %s3 : tensor<2x3x4xf64, #SingletonTensor3> to tensor<2x3x4xf64, #SingletonTensor1>
		%t32 = sparse_tensor.convert %s3 : tensor<2x3x4xf64, #SingletonTensor3> to tensor<2x3x4xf64, #SingletonTensor2>

		//
		// Convert sparse tensor back to dense.
		//
		%d12 = sparse_tensor.convert %t12 : tensor<2x3x4xf64, #SingletonTensor2> to tensor<2x3x4xf64>
		%d13 = sparse_tensor.convert %t13 : tensor<2x3x4xf64, #SingletonTensor3> to tensor<2x3x4xf64>
		%d21 = sparse_tensor.convert %t21 : tensor<2x3x4xf64, #SingletonTensor1> to tensor<2x3x4xf64>
		%d23 = sparse_tensor.convert %t23 : tensor<2x3x4xf64, #SingletonTensor3> to tensor<2x3x4xf64>
		%d31 = sparse_tensor.convert %t31 : tensor<2x3x4xf64, #SingletonTensor1> to tensor<2x3x4xf64>
		%d32 = sparse_tensor.convert %t32 : tensor<2x3x4xf64, #SingletonTensor2> to tensor<2x3x4xf64>

		//
		// Check round-trip equality. And release dense tensors.
		//
		// CHECK-COUNT-7: ( ( ( 1, 0, 0, 0 ), ( 0, 6, 0, 0 ), ( 0, 0, 11, 0 ) ), ( ( 0, 14, 0, 0 ), ( 0, 0, 0, 20 ), ( 21, 0, 0, 0 ) ) )
		call @dump(%src) : (tensor<2x3x4xf64>) -> ()
		call @dumpAndRelease_234(%d12) : (tensor<2x3x4xf64>) -> ()
		call @dumpAndRelease_234(%d13) : (tensor<2x3x4xf64>) -> ()
		call @dumpAndRelease_234(%d21) : (tensor<2x3x4xf64>) -> ()
		call @dumpAndRelease_234(%d23) : (tensor<2x3x4xf64>) -> ()
		call @dumpAndRelease_234(%d31) : (tensor<2x3x4xf64>) -> ()
		call @dumpAndRelease_234(%d32) : (tensor<2x3x4xf64>) -> ()

		//
		// Release sparse tensors.
		//
		bufferization.dealloc_tensor %t12 : tensor<2x3x4xf64, #SingletonTensor2>
		bufferization.dealloc_tensor %t13 : tensor<2x3x4xf64, #SingletonTensor3>
		bufferization.dealloc_tensor %t21 : tensor<2x3x4xf64, #SingletonTensor1>
		bufferization.dealloc_tensor %t23 : tensor<2x3x4xf64, #SingletonTensor3>
		bufferization.dealloc_tensor %t31 : tensor<2x3x4xf64, #SingletonTensor1>
		bufferization.dealloc_tensor %t32 : tensor<2x3x4xf64, #SingletonTensor2>

		return
		}

		//
		// Main driver.
		//
		func.func @entry() {
		call @testNonSingleton() : () -> ()
		call @testSingleton() : () -> ()
		return
		}
}		}
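
Note on the `@testSingleton` input above: the comment describes the tensor as having "the 3rd dim being singleton". One reading, offered here as an assumption rather than something the diff states, is that every `(i, j)` row holds exactly one nonzero, so a single index per parent position suffices. The Python sketch below checks that property on the same data; the helper names are hypothetical.

```python
# The 2x3x4 source tensor from @testSingleton, copied from the test.
SRC = [
    [[1.0, 0.0, 0.0, 0.0], [0.0, 6.0, 0.0, 0.0], [0.0, 0.0, 11.0, 0.0]],
    [[0.0, 14.0, 0.0, 0.0], [0.0, 0.0, 0.0, 20.0], [21.0, 0.0, 0.0, 0.0]],
]


def nonzeros(tensor):
    """List (i, j, k, value) for every nonzero, in lexicographic order."""
    return [
        (i, j, k, v)
        for i, plane in enumerate(tensor)
        for j, row in enumerate(plane)
        for k, v in enumerate(row)
        if v != 0.0
    ]


def one_nonzero_per_row(tensor):
    """True if each innermost row contains exactly one nonzero entry."""
    return all(
        sum(1 for v in row if v != 0.0) == 1
        for plane in tensor
        for row in plane
    )
```

On this data `one_nonzero_per_row(SRC)` holds and `nonzeros(SRC)` has six entries. The `CHECK-COUNT-7` line then corresponds to the seven dumps issued in the test: `%src` plus the six round-tripped tensors.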

mlir/test/Integration/Dialect/SparseTensor/python/test_stress.py

[183 lines not shown]	def main():
print("\nTEST: test_stress")		print("\nTEST: test_stress")
with ir.Context() as ctx, ir.Location.unknown():		with ir.Context() as ctx, ir.Location.unknown():
vl = 1		vl = 1
e = False		e = False
# Disable direct sparse2sparse conversion, because it doubles the time!		# Disable direct sparse2sparse conversion, because it doubles the time!
# TODO: While direct s2s is far too slow for per-commit testing,		# TODO: While direct s2s is far too slow for per-commit testing,
# we should have some framework ensure that we run this test with		# we should have some framework ensure that we run this test with
# `s2s=0` on a regular basis, to ensure that it does continue to work.		# `s2s=0` on a regular basis, to ensure that it does continue to work.
		# TODO: be sure to test s2s=0 together with singletons.
s2s = 1		s2s = 1
sparsification_options = (		sparsification_options = (
f'parallelization-strategy=none '		f'parallelization-strategy=none '
f'vectorization-strategy=none '		f'vectorization-strategy=none '
f'vl={vl} '		f'vl={vl} '
f'enable-simd-index32={e} '		f'enable-simd-index32={e} '
f's2s-strategy={s2s}')		f's2s-strategy={s2s}')
compiler = sparse_compiler.SparseCompiler(		compiler = sparse_compiler.SparseCompiler(
options=sparsification_options, opt_level=0, shared_libs=[support_lib])		options=sparsification_options, opt_level=0, shared_libs=[support_lib])
f64 = ir.F64Type.get()		f64 = ir.F64Type.get()
# Be careful about increasing this because		# Be careful about increasing this because
# len(types) = 1 + 2^rank * rank! * len(bitwidths)^2		# len(types) = 1 + len(level_choices)^rank * rank! * len(bitwidths)^2
shape = range(2, 6)		shape = range(2, 6)
rank = len(shape)		rank = len(shape)
# All combinations.		# All combinations.
		# TODO: add singleton here too; which requires updating how `np_arg0`
		# is initialized below.
levels = list(itertools.product(*itertools.repeat(		levels = list(itertools.product(*itertools.repeat(
[st.DimLevelType.dense, st.DimLevelType.compressed], rank)))		[st.DimLevelType.dense, st.DimLevelType.compressed], rank)))
# All permutations.		# All permutations.
orderings = list(map(ir.AffineMap.get_permutation,		orderings = list(map(ir.AffineMap.get_permutation,
itertools.permutations(range(rank))))		itertools.permutations(range(rank))))
bitwidths = [0]		bitwidths = [0]
# The first type must be a dense tensor for numpy conversion to work.		# The first type must be a dense tensor for numpy conversion to work.
types = [ir.RankedTensorType.get(shape, f64)]		types = [ir.RankedTensorType.get(shape, f64)]
[31 lines not shown]
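
Note on the updated size comment in `main()` above: `len(types) = 1 + len(level_choices)^rank * rank! * len(bitwidths)^2` grows quickly once a third level type is added. The sketch below evaluates the formula and cross-checks it against an explicit enumeration with `itertools`. The helper name `count_types` is hypothetical, and the squared bitwidth factor follows the comment's formula; the hidden lines are assumed to loop over pointer and index bitwidths independently.

```python
import itertools
import math


def count_types(rank, level_choices, bitwidths):
    """Number of tensor types the stress test would build, per the comment."""
    levels = list(itertools.product(*itertools.repeat(level_choices, rank)))
    orderings = list(itertools.permutations(range(rank)))
    # One dense type, plus every (levels, ordering, pwidth, iwidth) combination.
    enumerated = 1 + len(levels) * len(orderings) * len(bitwidths) ** 2
    formula = (1 + len(level_choices) ** rank * math.factorial(rank)
               * len(bitwidths) ** 2)
    assert enumerated == formula
    return enumerated
```

With the test's current settings (`rank = 4`, two level choices, `bitwidths = [0]`) this gives 385 types. Adding singleton as a third choice, as the TODO suggests, would raise it to 1945.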

[mlir][sparse] further implement singleton dimension level type (Closed, Public)

Diff 464126
