This is an archive of the discontinued LLVM Phabricator instance.

Differential D133462

[mlir][sparse] refactoring SparseTensorUtils: (1 of 4) file-splitting
ClosedPublic

Authored by wrengr on Sep 7 2022, 4:44 PM.

Details

Reviewers

aartbik
bixia
penpornk
Peiming
nicolasvasilache

Commits

rG0fca5c5f45c3: [mlir][sparse] refactoring SparseTensorUtils: (1 of 4) file-splitting

Summary

Previously, the SparseTensorUtils.cpp library contained a C++ core implementation, but hid it in an anonymous namespace and exposed only a C API for accessing it. Now we are factoring out that C++ core into a standalone C++ library so that it can be used directly by downstream clients (per the request of one such client). This refactoring has been decomposed into a stack of differentials to simplify code review; however, the full stack of changes should be considered together.

(this): Part 1: split one file into several
D133830: Part 2: Reorder chunks within files
D133831: Part 3: General code cleanup
D133833: Part 4: Update documentation

This part aims to make no changes other than the 1:N file splitting, and things which are forced to accompany that change.
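
To make the summary concrete, here is a minimal sketch of the layering it describes: a C++ core in a named namespace that clients can use directly, with the C API reduced to a thin wrapper. All names (`demo::sparse::TensorShape`, `demoNewShape`, and so on) are hypothetical and are not taken from the actual SparseTensorUtils sources.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for the C++ core. Code hidden in an anonymous
// namespace can only be named from its own translation unit; placing it in
// a named namespace (declared in a header) lets C++ clients use it directly.
namespace demo {
namespace sparse {

class TensorShape {
public:
  explicit TensorShape(std::vector<uint64_t> sizes)
      : dimSizes(std::move(sizes)) {}

  /// Number of dimensions.
  uint64_t getRank() const { return dimSizes.size(); }

  /// Product of all dimension sizes (1 for rank 0).
  uint64_t getNumElements() const {
    uint64_t n = 1;
    for (uint64_t sz : dimSizes)
      n *= sz;
    return n;
  }

private:
  std::vector<uint64_t> dimSizes;
};

} // namespace sparse
} // namespace demo

// The C API stays a thin wrapper over the C++ core: opaque pointers in,
// plain integers out.
extern "C" {
void *demoNewShape(uint64_t rank, const uint64_t *sizes) {
  return new demo::sparse::TensorShape(
      std::vector<uint64_t>(sizes, sizes + rank));
}
uint64_t demoShapeNumElements(void *shape) {
  return static_cast<demo::sparse::TensorShape *>(shape)->getNumElements();
}
void demoDeleteShape(void *shape) {
  delete static_cast<demo::sparse::TensorShape *>(shape);
}
}
```

A C++ client can construct `demo::sparse::TensorShape` directly, while generated code keeps calling the `extern "C"` entry points.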

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

wrengr created this revision. Sep 7 2022, 4:44 PM

Herald added a project: Restricted Project. Sep 7 2022, 4:44 PM

Herald added subscribers: anlunx, bzcheeseman, sdasgup3 and 19 others.

wrengr requested review of this revision. Sep 7 2022, 4:44 PM

Herald added subscribers: stephenneuendorffer, nicolasvasilache.

Harbormaster completed remote builds in B185530: Diff 458605. Sep 7 2022, 5:00 PM

overall direction looks good, but I am having a very hard time reviewing the refactoring *and* the many comment/code changes at the same time

is there some tooling available for this I am not aware of (seems that a 1:N refactoring is not that uncommon)....

mlir/include/mlir/ExecutionEngine/Float16bits.h
19	Just curious, is there a plan forward to fix this? If not, perhaps we don't need this comment, or perhaps not as part of this revision?
mlir/include/mlir/ExecutionEngine/SparseTensor/COO.h
30	In this refactoring, you broke existing code up and you changed a lot of comments. Is there any way in phabricator or git that allows me to do a side-by-side comparison of copied stuff. For example, this Element comment looks like the original comment I wrote, but you also made many changes.
mlir/include/mlir/ExecutionEngine/SparseTensor/CheckedMul.h
34	I actually don't think we need so much real estate for just this check
mlir/include/mlir/ExecutionEngine/SparseTensor/File.h
266	extra line before closing #endif
mlir/include/mlir/ExecutionEngine/SparseTensor/Storage.h
882	empty line
mlir/lib/ExecutionEngine/SparseTensorUtils.cpp
164	This was before the ===- on purpose (there are two ===- headers inside the extern). Placing it here seems to suggest that only this "public functions" part is extern.

Thanks for doing this Wren! I just have a few nitpicks to add :)

mlir/include/mlir/ExecutionEngine/SparseTensor/COO.h
76	Can we use std::span or ArrayRef here (and other places that use const std::vector&)?
172	Why not define a real iterator and use traditional iterator semantics?
mlir/include/mlir/ExecutionEngine/SparseTensor/ErrorHandling.h
35	Could this be something like MLIR_SPARSETENSOR_FATAL since this is included in public headers?
mlir/include/mlir/ExecutionEngine/SparseTensor/Storage.h
84	Can these be ArrayRefs also?
mlir/include/mlir/ExecutionEngine/SparseTensorUtils.h
24–25	I think that even inside a namespace, extern "C" causes these symbols to share the global namespace. It might be better to remove the namespace and prefix each symbol to avoid collisions?
mlir/lib/ExecutionEngine/SparseTensor/File.cpp
32	Why not enclose the definitions below in the namespace?

I tried a few internal tools, but none detected the moved code in a smart way :-(

Perhaps you can give some outline of where you changed code/comments, and where you did not,
so I can do a bit more focused review of internals that changed.

mlir/include/mlir/ExecutionEngine/SparseTensor/COO.h
145	Would it be worthwhile to be smart about isSorted here (i.e., avoid invalidating it by comparing the last element with the new one)?
192	empty line before #endif
mlir/include/mlir/ExecutionEngine/SparseTensor/Enums.h
148	empty line
mlir/include/mlir/ExecutionEngine/SparseTensorUtils.h
1	This reference to enums seems very outdated now
211	empty line before closing #endif
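
The `isSorted` suggestion above (compare only the last element with the new one) could look roughly like the following. This is a simplified, hypothetical container, not the `SparseTensorCOO` class under review: values are omitted and each coordinate tuple is stored as its own vector.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified COO container that keeps an `isSorted` flag up
// to date on every add() instead of invalidating it unconditionally.
class SortedTrackingCOO {
public:
  explicit SortedTrackingCOO(uint64_t rank) : rank(rank) {}

  void add(const std::vector<uint64_t> &ind) {
    assert(ind.size() == rank && "Element rank mismatch");
    // Only the previous element needs to be inspected: if everything so
    // far is sorted and the new tuple is not lexicographically smaller
    // than the last one, the whole sequence stays sorted.
    // (std::vector's operator< compares lexicographically.)
    if (sorted && !coords.empty() && ind < coords.back())
      sorted = false;
    coords.push_back(ind);
  }

  bool isSorted() const { return sorted; }

private:
  uint64_t rank;
  std::vector<std::vector<uint64_t>> coords;
  bool sorted = true; // an empty collection is trivially sorted
};
```

With such a flag, a later `sort()` call could return early when the elements were already added in order.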

aartbik added inline comments. Sep 12 2022, 4:15 PM

mlir/include/mlir/ExecutionEngine/SparseTensor/ErrorHandling.h
46	don't check in commented out code as a general rule

> overall direction looks good, but I am having a very hard time reviewing the refactoring *and* the many comment/code changes at the same time

I'll split the nontrivial changes off into separate CLs. Should've done that before, but I was hoping the changes were local enough to be easily reviewable.

> is there some tooling available for this I am not aware of (seems that a 1:N refactoring is not that uncommon)....

Not that I'm aware of, though if anyone else knows of one I'd love to know. I myself tend to do a lot of 1:N refactorings, so it'd really help my reviewers :)

mlir/include/mlir/ExecutionEngine/Float16bits.h
19	No plans at the moment, I just forgot to remove the note-to-self
mlir/include/mlir/ExecutionEngine/SparseTensor/COO.h
76	Alas, we can't use `ArrayRef` since this library doesn't depend on llvm/mlir (just like how we have to use `std::function` in lieu of `llvm::function_ref`). But, if that's a strong ask then I can talk with Mehdi to see if we can't figure out a way to introduce the dependency without causing issues for the rest of the ExecutionEngine stuff. I'm not familiar with `std::span` but I'll take a look to see if it can be an appropriate replacement
172	The `SparseTensorCOO` class was originally designed as an entirely internal format, and the iterator stuff was originally introduced for the sake of sparse2sparse conversion. Back when implementing sparse2sparse conversion, Aart and I decided it was cleaner to take this approach rather than implement a C++-style iterator, since the iterator was just in service of MLIR codegen rather than ever being used from C++ itself. Fwiw, this is also why the `SparseTensorEnumerator` is designed as it is rather than as a C++-style iterator. Of course, the design tradeoffs are different now that we're factoring things out to be used by C++ itself rather than just by MLIR codegen. After this CL lands I can work on converting this to a traditional C++ iterator. Changing `SparseTensorEnumerator` is a much larger undertaking (and something I'd like to hold off on as long as possible, since it's likely to change substantially once we introduce support for block sparsity etc); but changing `SparseTensorCOO` is simple enough.
mlir/include/mlir/ExecutionEngine/SparseTensor/CheckedMul.h
34	Is having a single header file really that much real estate? I'd rather not remove the checks, since that would introduce a regression against the current code.
mlir/include/mlir/ExecutionEngine/SparseTensor/ErrorHandling.h
35	Yes of course :) I was hoping to get the functional version below to work before landing this CL, and just forgot to do the renaming when I ran afoul of the `[-Wformat-security]` warnings.
mlir/lib/ExecutionEngine/SparseTensor/File.cpp
32	It's LLVM/MLIR style and guards against future bugs
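
The "guards against future bugs" argument for qualified definitions can be illustrated as follows. The function `demo::sparse::countNonZeros` is a hypothetical example, not part of File.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a header: the declaration lives inside the namespace.
namespace demo {
namespace sparse {
uint64_t countNonZeros(const uint64_t *values, uint64_t size);
} // namespace sparse
} // namespace demo

// Stand-in for the .cpp file: the definition uses a qualified name instead
// of reopening the namespace. If the signature here drifted from the
// declaration above (say `uint32_t size`), the compiler would reject it,
// because a qualified definition must match a prior declaration. Inside a
// reopened namespace block the same slip would quietly declare a new
// overload, and the mismatch would surface, at best, as a link error.
uint64_t demo::sparse::countNonZeros(const uint64_t *values, uint64_t size) {
  uint64_t count = 0;
  for (uint64_t i = 0; i < size; ++i)
    if (values[i] != 0)
      ++count;
  return count;
}
```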

Revert all changes other than:

File splitting
removing static keyword (since it means something different in headers)
namespacing
rename FATAL macro
git-clang-format

wrengr edited the summary of this revision. Sep 13 2022, 1:47 PM

Harbormaster completed remote builds in B186447: Diff 459866. Sep 13 2022, 2:14 PM

wrengr added a child revision: D133830: [mlir][sparse] refactoring SparseTensorUtils: (2 of 4) reordering. Sep 13 2022, 9:10 PM

wrengr mentioned this in D133831: [mlir][sparse] refactoring SparseTensorUtils: (3 of 4) code-cleanup. Sep 13 2022, 9:15 PM

wrengr mentioned this in D133833: [mlir][sparse] refactoring SparseTensorUtils: (4 of 4) documentation. Sep 13 2022, 9:21 PM

wrengr retitled this revision from "[mlir][sparse] Factoring out the SparseTensorUtils library" to "[mlir][sparse] refactoring SparseTensorUtils: (1 of 4) file-splitting". Sep 13 2022, 9:23 PM

wrengr edited the summary of this revision.

wrengr mentioned this in D133835: [mlir][sparse] Factoring out SparseTensorFile::canReadAs predicate. Sep 13 2022, 9:40 PM

wrengr mentioned this in D133836: [mlir][sparse] Improve sparse_tensor::detail::readCOOValue template. Sep 13 2022, 9:48 PM

wrengr mentioned this in D133837: [mlir][sparse] Cleaning up SparseTensorFile::readMMEHeader. Sep 13 2022, 9:56 PM

wrengr mentioned this in D133838: [mlir][sparse] optimizing permutation validity check in toMLIRSparseTensor. Sep 13 2022, 10:03 PM

wrengr mentioned this in D133839: [mlir][sparse] Adding isSorted bit to SparseTensorCOO. Sep 13 2022, 10:09 PM

Thanks for this break up. Much easier to review now with confidence.

mlir/include/mlir/ExecutionEngine/Float16bits.h
19	Still there? ;-)
mlir/lib/ExecutionEngine/SparseTensor/StorageBase.cpp
1 ↗	(On Diff #459866)	I realize this file only implements StorageBase code (due to templating), but for some weird sense of symmetry, would you be open to simply calling this file Storage.cpp, as the counterpart of Storage.h? No very strong opinion though, so your call.

This revision is now accepted and ready to land. Sep 14 2022, 2:14 PM

wrengr marked 4 inline comments as done. Sep 15 2022, 5:44 PM

wrengr added inline comments.

mlir/include/mlir/ExecutionEngine/Float16bits.h
19	what?! How did that get lost in the rebasing... Let's try this again :)
mlir/lib/ExecutionEngine/SparseTensor/StorageBase.cpp
1 ↗	(On Diff #459866)	I originally had it called Storage.cpp, but then renamed it because there's also NNZ.cpp; though I don't feel strongly about it either way

Addressing nits, and rebasing

Harbormaster completed remote builds in B186996: Diff 460573. Sep 15 2022, 6:17 PM

Attempting to fix the Windows build error

Harbormaster completed remote builds in B187649: Diff 461428. Sep 19 2022, 7:43 PM

Attempting to fix the Windows build error (2/n)

Harbormaster completed remote builds in B187815: Diff 461671. Sep 20 2022, 1:25 PM

Attempting to fix the Windows build error (3/n)

Harbormaster completed remote builds in B187830: Diff 461690. Sep 20 2022, 2:28 PM

Attempting to fix the Windows build error (4/n)

Harbormaster completed remote builds in B187854: Diff 461724. Sep 20 2022, 3:53 PM

I just noticed the Debian CMake build was broken by the last few changes; so reverting to the last known-good CMake files for Debian. Windows is still failing at the linking step though

Harbormaster completed remote builds in B187873: Diff 461750. Sep 20 2022, 4:15 PM

Attempting to fix the Windows build error (5/n)

Harbormaster completed remote builds in B187884: Diff 461764. Sep 20 2022, 5:06 PM

Incorporating D134096.

Also fleshing out the foo_EXPORTS idiom for (hopefully) getting things to link on Windows. (This idiom is used in other ExecutionEngine libraries, and seems to be the same as the one described here: https://gernotklingler.com/blog/creating-using-shared-libraries-different-compilers-different-operating-systems/)
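
For readers unfamiliar with the foo_EXPORTS idiom, here is a minimal single-file sketch. The library name `demo_utils`, the macro `DEMO_UTILS_EXPORT` and the function `demoUtilsVersion` are hypothetical; the one fact relied on is that CMake defines `<target>_EXPORTS` while compiling the sources of a shared library target.

```cpp
#include <cassert>

// In a real build, CMake passes -Ddemo_utils_EXPORTS while compiling the
// shared library target `demo_utils` (a hypothetical name). It is defined
// by hand here so that the sketch is a single translation unit.
#define demo_utils_EXPORTS

#ifdef _WIN32
#ifdef demo_utils_EXPORTS // We are building this library
#define DEMO_UTILS_EXPORT __declspec(dllexport)
#else // We are using this library
#define DEMO_UTILS_EXPORT __declspec(dllimport)
#endif // demo_utils_EXPORTS
#else // Non-Windows: use visibility attributes.
#define DEMO_UTILS_EXPORT __attribute__((visibility("default")))
#endif // _WIN32

// Public entry point: exported while building the library, imported by
// code that merely includes the header.
extern "C" DEMO_UTILS_EXPORT int demoUtilsVersion();

// Definition, as it would appear in the library's .cpp file.
int demoUtilsVersion() { return 1; }
```

The same header therefore serves both sides: the library build sees `dllexport`, and every consumer sees `dllimport`.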

Harbormaster completed remote builds in B187888: Diff 461772. Sep 20 2022, 6:10 PM

aartbik added inline comments. Sep 21 2022, 7:12 PM

mlir/include/mlir/ExecutionEngine/SparseTensor/Storage.h
147	thanks for pre-merging this!

Herald added a subscriber: zero9178. Sep 21 2022, 7:12 PM

Attempting to fix the Windows build error (6/n); fully fleshed out the foo_EXPORTS idiom for the mlir_sparsetensor_utils library.

Herald added a reviewer: nicolasvasilache. Sep 28 2022, 3:52 PM

Harbormaster completed remote builds in B189269: Diff 463697. Sep 28 2022, 4:11 PM

Adding cpp guard for out-of-line definitions in header files (fixes "error C2491" in Storage.h)

Harbormaster completed remote builds in B189280: Diff 463713. Sep 28 2022, 4:58 PM

Making the mlir_sparsetensor_utils library static, so as to avoid warnings (and other issues) on Windows.

Harbormaster completed remote builds in B189294: Diff 463728. Sep 28 2022, 6:12 PM

removing the EXPORT.h file, since no longer necessary

Harbormaster completed remote builds in B189472: Diff 463979. Sep 29 2022, 12:31 PM

rebase

Harbormaster completed remote builds in B189499: Diff 464019. Sep 29 2022, 2:16 PM

Closed by commit rG0fca5c5f45c3: [mlir][sparse] refactoring SparseTensorUtils: (1 of 4) file-splitting (authored by wrengr). Sep 29 2022, 2:35 PM

This revision was automatically updated to reflect the committed changes.

wrengr added a commit: rG0fca5c5f45c3: [mlir][sparse] refactoring SparseTensorUtils: (1 of 4) file-splitting.

wrengr mentioned this in rG329f2f103af1: [mlir][sparse] refactoring SparseTensorUtils: (3 of 4) code-cleanup. Sep 29 2022, 2:44 PM

wrengr mentioned this in rG164b66f796dd: [mlir][sparse] refactoring SparseTensorUtils: (4 of 4) documentation.

wrengr mentioned this in rGc8177f845b41: [mlir][sparse] Factoring out SparseTensorFile::canReadAs predicate. Sep 29 2022, 2:46 PM

wrengr mentioned this in rG4792b8ae869c: [mlir][sparse] Cleaning up SparseTensorFile::readMMEHeader. Sep 29 2022, 3:00 PM

wrengr mentioned this in rGac741889c144: [mlir][sparse] Adding isSorted bit to SparseTensorCOO.

wrengr mentioned this in rGc42ecce7b922: [mlir][sparse] optimizing permutation validity check in toMLIRSparseTensor. Sep 29 2022, 3:08 PM

wrengr mentioned this in rG68609598e45f: [mlir][sparse] Improve sparse_tensor::detail::readCOOValue template. Sep 29 2022, 3:26 PM

stella.stamenova mentioned this in D134933: [mlir][sparse] further implement singleton dimension level type. Oct 7 2022, 10:59 AM

aganea added a subscriber: aganea. Oct 7 2022, 4:06 PM

aganea added inline comments.

mlir/include/mlir/ExecutionEngine/SparseTensor/Enums.h

Hello @wrengr! I'm seeing a bunch of warnings like this when building with clang-cl 14.0.6 on Windows:

C:/git/llvm-project/mlir/include\mlir/ExecutionEngine/SparseTensor/Enums.h(58,12): warning: 'dllexport' attribute only applies to functions, variables, classes, and Objective-C interfaces [-Wignored-attributes]
enum class MLIR_SPARSETENSOR_EXPORT OverheadType : uint32_t {
           ^
C:/git/llvm-project/mlir/include\mlir/ExecutionEngine/SparseTensor/Enums.h(36,45): note: expanded from macro 'MLIR_SPARSETENSOR_EXPORT'
#define MLIR_SPARSETENSOR_EXPORT __declspec(dllexport)
                                            ^

The dllimport/dllexport can be omitted here on Windows because an enum is a type not a symbol. For non-Windows, do you really need the __attribute__((visibility("default"))) on enums?

wrengr added inline comments. Oct 7 2022, 4:50 PM

mlir/include/mlir/ExecutionEngine/SparseTensor/Enums.h
58	> For non-Windows, do you really need the __attribute__((visibility("default"))) on enums?

Probably not! On Linux everything works fine when `MLIR_SPARSETENSOR_EXPORT` expands to nothing (both when used on functions and on enums). This is my first time working with DLLs on Windows, so I just defined the macro following the idiom used elsewhere in the MLIR codebase. For MSVC and MSYS2 I thought the dllexport/dllimport stuff needed to be applied to `enum class` definitions (not just `class`/`struct` definitions), but I may very well be wrong. I'm about to post a differential for trying to resolve some other Windows issues (https://reviews.llvm.org/D134933#3843372). Once that's up, could I get you to try building it to make sure it works on your Windows+Clang setup?

aganea added inline comments. Oct 7 2022, 4:56 PM

mlir/include/mlir/ExecutionEngine/SparseTensor/Enums.h
58	I will certainly try that! Thanks!

wrengr mentioned this in D135502: [mlir][sparse] Removing DLL attributes from ExecutionEngine/SparseTensor/Enums.h. Oct 7 2022, 5:00 PM

wrengr added inline comments. Oct 7 2022, 5:03 PM

mlir/include/mlir/ExecutionEngine/SparseTensor/Enums.h
58	I just uploaded D135502, so please comment there to let me know how it goes :)

wrengr mentioned this in rG1aa06aeb1a00: [mlir][sparse] Removing DLL attributes from ExecutionEngine/SparseTensor/Enums.h. Oct 10 2022, 11:22 AM

Revision Contents

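Path	Size
mlir/include/mlir/ExecutionEngine/Float16bits.h	21 lines
mlir/include/mlir/ExecutionEngine/SparseTensor/COO.h	163 lines
mlir/include/mlir/ExecutionEngine/SparseTensor/CheckedMul.h	43 lines
mlir/include/mlir/ExecutionEngine/SparseTensor/Enums.h	163 lines
mlir/include/mlir/ExecutionEngine/SparseTensor/ErrorHandling.h	42 lines
mlir/include/mlir/ExecutionEngine/SparseTensor/File.h	259 lines
mlir/include/mlir/ExecutionEngine/SparseTensor/Storage.h	917 lines
mlir/include/mlir/ExecutionEngine/SparseTensorUtils.h	121 lines
mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.h	2 lines
mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp	2 lines
mlir/lib/ExecutionEngine/CMakeLists.txt	15 lines
mlir/lib/ExecutionEngine/Float16bits.cpp	5 lines
mlir/lib/ExecutionEngine/SparseTensor/CMakeLists.txt	23 lines
mlir/lib/ExecutionEngine/SparseTensor/File.cpp	163 lines
mlir/lib/ExecutionEngine/SparseTensor/NNZ.cpp	88 lines
mlir/lib/ExecutionEngine/SparseTensor/Storage.cpp	100 lines
mlir/lib/ExecutionEngine/SparseTensorUtils.cpp	1377 lines
utils/bazel/llvm-project-overlay/mlir/BUILD.bazel	58 lines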

Diff 464033

mlir/include/mlir/ExecutionEngine/Float16bits.h

	Show All 10 Lines
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#ifndef MLIR_EXECUTIONENGINE_FLOAT16BITS_H_			#ifndef MLIR_EXECUTIONENGINE_FLOAT16BITS_H_
	#define MLIR_EXECUTIONENGINE_FLOAT16BITS_H_			#define MLIR_EXECUTIONENGINE_FLOAT16BITS_H_

	#include <cstdint>			#include <cstdint>
	#include <iostream>			#include <iostream>

				#ifdef _WIN32
				#ifdef mlir_float16_utils_EXPORTS // We are building this library
				#define MLIR_FLOAT16_EXPORT __declspec(dllexport)
				#define MLIR_FLOAT16_DEFINE_FUNCTIONS
				#else // We are using this library
				#define MLIR_FLOAT16_EXPORT __declspec(dllimport)
				#endif // mlir_float16_utils_EXPORTS
				#else // Non-windows: use visibility attributes.
				#define MLIR_FLOAT16_EXPORT __attribute__((visibility("default")))
				#define MLIR_FLOAT16_DEFINE_FUNCTIONS
				#endif // _WIN32

	// Implements half precision and bfloat with f16 and bf16, using the MLIR type			// Implements half precision and bfloat with f16 and bf16, using the MLIR type
	// names. These data types are also used for c-interface runtime routines.			// names. These data types are also used for c-interface runtime routines.
	extern "C" {			extern "C" {
	struct f16 {			struct MLIR_FLOAT16_EXPORT f16 {
	f16(float f = 0);			f16(float f = 0);
	uint16_t bits;			uint16_t bits;
	};			};

	struct bf16 {			struct MLIR_FLOAT16_EXPORT bf16 {
	bf16(float f = 0);			bf16(float f = 0);
	uint16_t bits;			uint16_t bits;
	};			};
	}			}

	// Outputs a half precision value.			// Outputs a half precision value.
	std::ostream &operator<<(std::ostream &os, const f16 &f);			MLIR_FLOAT16_EXPORT std::ostream &operator<<(std::ostream &os, const f16 &f);
	// Outputs a bfloat value.			// Outputs a bfloat value.
	std::ostream &operator<<(std::ostream &os, const bf16 &d);			MLIR_FLOAT16_EXPORT std::ostream &operator<<(std::ostream &os, const bf16 &d);

				#undef MLIR_FLOAT16_EXPORT
	#endif // MLIR_EXECUTIONENGINE_FLOAT16BITS_H_			#endif // MLIR_EXECUTIONENGINE_FLOAT16BITS_H_

mlir/include/mlir/ExecutionEngine/SparseTensor/COO.h

This file was added.

				//===- COO.h - Coordinate-scheme sparse tensor representation ---- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file is part of the lightweight runtime support library for sparse
				// tensor manipulations. The functionality of the support library is meant
				// to simplify benchmarking, testing, and debugging MLIR code operating on
				// sparse tensors. However, the provided functionality is not part of
				// core MLIR itself.
				//
				//===----------------------------------------------------------------------===//

				#ifndef MLIR_EXECUTIONENGINE_SPARSETENSOR_COO_H
				#define MLIR_EXECUTIONENGINE_SPARSETENSOR_COO_H

				#include <algorithm>
				#include <cassert>
				#include <cinttypes>
				#include <functional>
				#include <vector>

				namespace mlir {
				namespace sparse_tensor {

				/// A sparse tensor element in coordinate scheme (value and indices).
				/// For example, a rank-1 vector element would look like
				///   ({i}, a[i])
				/// and a rank-5 tensor element like
				///   ({i,j,k,l,m}, a[i,j,k,l,m])
				/// We use pointer to a shared index pool rather than e.g. a direct
				/// vector since that (1) reduces the per-element memory footprint, and
				/// (2) centralizes the memory reservation and (re)allocation to one place.
				template <typename V>
				struct Element final {
				  Element(uint64_t *ind, V val) : indices(ind), value(val){};
				  uint64_t *indices; // pointer into shared index pool
				  V value;
				};

				/// The type of callback functions which receive an element. We avoid
				/// packaging the coordinates and value together as an `Element` object
				/// because this helps keep code somewhat cleaner.
				template <typename V>
				using ElementConsumer =
				    const std::function<void(const std::vector<uint64_t> &, V)> &;

				/// A memory-resident sparse tensor in coordinate scheme (collection of
				/// elements). This data structure is used to read a sparse tensor from
				/// any external format into memory and sort the elements lexicographically
				/// by indices before passing it back to the client (most packed storage
				/// formats require the elements to appear in lexicographic index order).
				template <typename V>
				struct SparseTensorCOO final {
				public:
				  SparseTensorCOO(const std::vector<uint64_t> &dimSizes, uint64_t capacity)
				      : dimSizes(dimSizes) {
				    if (capacity) {
				      elements.reserve(capacity);
				      indices.reserve(capacity * getRank());
				    }
				  }

				  /// Adds element as indices and value.
				  void add(const std::vector<uint64_t> &ind, V val) {
				    assert(!iteratorLocked && "Attempt to add() after startIterator()");
				    uint64_t *base = indices.data();
				    uint64_t size = indices.size();
				    uint64_t rank = getRank();
				    assert(ind.size() == rank && "Element rank mismatch");
				    for (uint64_t r = 0; r < rank; r++) {
				      assert(ind[r] < dimSizes[r] && "Index is too large for the dimension");
				      indices.push_back(ind[r]);
				    }
				    // This base only changes if indices were reallocated. In that case, we
				    // need to correct all previous pointers into the vector. Note that this
				    // only happens if we did not set the initial capacity right, and then only
				    // for every internal vector reallocation (which with the doubling rule
				    // should only incur an amortized linear overhead).
				    uint64_t *newBase = indices.data();
				    if (newBase != base) {
				      for (uint64_t i = 0, n = elements.size(); i < n; i++)
				        elements[i].indices = newBase + (elements[i].indices - base);
				      base = newBase;
				    }
				    // Add element as (pointer into shared index pool, value) pair.
				    elements.emplace_back(base + size, val);
				  }

				  /// Sorts elements lexicographically by index.
				  void sort() {
				    assert(!iteratorLocked && "Attempt to sort() after startIterator()");
				    // TODO: we may want to cache an `isSorted` bit, to avoid
				    // unnecessary/redundant sorting.
				    uint64_t rank = getRank();
				    std::sort(elements.begin(), elements.end(),
				              [rank](const Element<V> &e1, const Element<V> &e2) {
				                for (uint64_t r = 0; r < rank; r++) {
				                  if (e1.indices[r] == e2.indices[r])
				                    continue;
				                  return e1.indices[r] < e2.indices[r];
				                }
				                return false;
				              });
				  }

				  /// Get the rank of the tensor.
				  uint64_t getRank() const { return dimSizes.size(); }

				  /// Getter for the dimension-sizes array.
				  const std::vector<uint64_t> &getDimSizes() const { return dimSizes; }

				  /// Getter for the elements array.
				  const std::vector<Element<V>> &getElements() const { return elements; }

				  /// Switch into iterator mode.
				  void startIterator() {
				    iteratorLocked = true;
				    iteratorPos = 0;
				  }

				  /// Get the next element.
				  const Element<V> *getNext() {
				    assert(iteratorLocked && "Attempt to getNext() before startIterator()");
				    if (iteratorPos < elements.size())
				      return &(elements[iteratorPos++]);
				    iteratorLocked = false;
				    return nullptr;
				  }

				  /// Factory method. Permutes the original dimensions according to
				  /// the given ordering and expects subsequent add() calls to honor
				  /// that same ordering for the given indices. The result is a
				  /// fully permuted coordinate scheme.
				  ///
				  /// Precondition: `dimSizes` and `perm` must be valid for `rank`.
				  static SparseTensorCOO<V> *newSparseTensorCOO(uint64_t rank,
				                                                const uint64_t *dimSizes,
				                                                const uint64_t *perm,
				                                                uint64_t capacity = 0) {
				    std::vector<uint64_t> permsz(rank);
				    for (uint64_t r = 0; r < rank; r++) {
				      assert(dimSizes[r] > 0 && "Dimension size zero has trivial storage");
				      permsz[perm[r]] = dimSizes[r];
				    }
				    return new SparseTensorCOO<V>(permsz, capacity);
				  }

				private:
				  const std::vector<uint64_t> dimSizes; // per-dimension sizes
				  std::vector<Element<V>> elements;     // all COO elements
				  std::vector<uint64_t> indices;        // shared index pool
				  bool iteratorLocked = false;
				  unsigned iteratorPos = 0;
				};

				} // namespace sparse_tensor
				} // namespace mlir

				#endif // MLIR_EXECUTIONENGINE_SPARSETENSOR_COO_H
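
The shared-index-pool bookkeeping in `add()` is the subtle part of this header, so here is a stripped-down sketch of the same idea. `PooledElement` and `PooledCOO` are hypothetical names, the value type is fixed to `double`, and, unlike the code above, this variant records each element's offset before the pool grows, so it never reads a pointer that reallocation has invalidated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified element: a pointer into the shared pool plus a
// value, mirroring the (indices, value) pair used in the header.
struct PooledElement {
  uint64_t *indices; // pointer into the shared pool
  double value;
};

class PooledCOO {
public:
  explicit PooledCOO(uint64_t rank) : rank(rank) {}

  void add(const std::vector<uint64_t> &ind, double val) {
    assert(ind.size() == rank && "Element rank mismatch");
    const std::size_t oldSize = pool.size();
    // Appending reallocates exactly when the new size exceeds the capacity.
    const bool willReallocate = oldSize + ind.size() > pool.capacity();
    std::vector<std::ptrdiff_t> offsets;
    if (willReallocate) {
      // Remember where each element points, relative to the old buffer.
      offsets.reserve(elements.size());
      for (const PooledElement &e : elements)
        offsets.push_back(e.indices - pool.data());
    }
    pool.insert(pool.end(), ind.begin(), ind.end());
    if (willReallocate) {
      // Rebase every stored pointer onto the new buffer.
      for (std::size_t i = 0; i < elements.size(); ++i)
        elements[i].indices = pool.data() + offsets[i];
    }
    elements.push_back({pool.data() + oldSize, val});
  }

  const std::vector<PooledElement> &getElements() const { return elements; }

private:
  uint64_t rank;
  std::vector<uint64_t> pool;          // all coordinates, back to back
  std::vector<PooledElement> elements; // one entry per stored value
};
```

Because the fix-up only runs when the pool actually reallocates, its total cost stays amortized linear in the number of elements, as the comment in `add()` above notes for the original.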

mlir/include/mlir/ExecutionEngine/SparseTensor/CheckedMul.h

This file was added.

				//===- CheckedMul.h - multiplication that checks for overflow ---- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This header is not part of the public API. It is placed in the
				// includes directory only because that's required by the implementations
				// of template-classes.
				//
				// This file is part of the lightweight runtime support library for sparse
				// tensor manipulations. The functionality of the support library is meant
				// to simplify benchmarking, testing, and debugging MLIR code operating on
				// sparse tensors. However, the provided functionality is not part of
				// core MLIR itself.
				//
				//===----------------------------------------------------------------------===//

				#ifndef MLIR_EXECUTIONENGINE_SPARSETENSOR_CHECKEDMUL_H
				#define MLIR_EXECUTIONENGINE_SPARSETENSOR_CHECKEDMUL_H

				#include <cassert>
				#include <cinttypes>
				#include <limits>

				namespace mlir {
				namespace sparse_tensor {
				namespace detail {

				/// A version of `operator*` on `uint64_t` which checks for overflows.
				inline uint64_t checkedMul(uint64_t lhs, uint64_t rhs) {
				  assert((lhs == 0 || rhs <= std::numeric_limits<uint64_t>::max() / lhs) &&
				         "Integer overflow");
				  return lhs * rhs;
				}

				} // namespace detail
				} // namespace sparse_tensor
				} // namespace mlir

				#endif // MLIR_EXECUTIONENGINE_SPARSETENSOR_CHECKEDMUL_H
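
The overflow test in `checkedMul` avoids computing the product first: `lhs * rhs` fits in a `uint64_t` exactly when `lhs == 0` or `rhs <= max / lhs`. As a minimal sketch of the same test, the helper below reports overflow through its return value instead of an `assert`. The name `tryCheckedMul` is hypothetical and not part of this patch, and the use of `std::optional` assumes a C++17 compiler.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper (not part of this patch): the same overflow test as
// `checkedMul`, but reporting failure via std::nullopt instead of `assert`.
inline std::optional<uint64_t> tryCheckedMul(uint64_t lhs, uint64_t rhs) {
  // `lhs * rhs` fits in 64 bits iff lhs == 0 or rhs <= max / lhs.
  if (lhs != 0 && rhs > std::numeric_limits<uint64_t>::max() / lhs)
    return std::nullopt;
  return lhs * rhs;
}
```

A caller could then fold a list of dimension sizes with `tryCheckedMul` and stop at the first `std::nullopt`, which keeps the check active even in builds where `assert` is compiled out.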

mlir/include/mlir/ExecutionEngine/SparseTensor/Enums.h

This file was added.

				//===- Enums.h - Enums shared with the runtime ------------------- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// Typedefs and enums for the lightweight runtime support library for
				// sparse tensor manipulations. These are required to be public so that
				// they can be shared with `Transforms/SparseTensorConversion.cpp`, since
				// they define the arguments to the public functions declared later on.
				//
				// This file also defines x-macros <https://en.wikipedia.org/wiki/X_Macro>
				// so that we can generate variations of the public functions for each
				// supported primary- and/or overhead-type.
				//
				// This file is part of the lightweight runtime support library for sparse
				// tensor manipulations. The functionality of the support library is meant
				// to simplify benchmarking, testing, and debugging MLIR code operating on
				// sparse tensors. However, the provided functionality is not part of
				// core MLIR itself.
				//
				//===----------------------------------------------------------------------===//

				#ifndef MLIR_EXECUTIONENGINE_SPARSETENSOR_ENUMS_H
				#define MLIR_EXECUTIONENGINE_SPARSETENSOR_ENUMS_H

				#include "mlir/ExecutionEngine/Float16bits.h"

				#include <cinttypes>
				#include <complex>

				#ifdef _WIN32
				#ifdef mlir_sparse_tensor_utils_EXPORTS // We are building this library
				#define MLIR_SPARSETENSOR_EXPORT __declspec(dllexport)
				#define MLIR_SPARSETENSOR_DEFINE_FUNCTIONS
				#else // We are using this library
				#define MLIR_SPARSETENSOR_EXPORT __declspec(dllimport)
				#endif // mlir_sparse_tensor_utils_EXPORTS
				#else // Non-windows: use visibility attributes.
				#define MLIR_SPARSETENSOR_EXPORT __attribute__((visibility("default")))
				#define MLIR_SPARSETENSOR_DEFINE_FUNCTIONS
				#endif // _WIN32

				namespace mlir {
				namespace sparse_tensor {

				/// This type is used in the public API at all places where MLIR expects
				/// values with the built-in type "index". For now, we simply assume that
				/// type is 64-bit, but targets with different "index" bit widths should
				/// link with an alternatively built runtime support library.
				// TODO: support such targets?
				using index_type = uint64_t;

				/// Encoding of overhead types (both pointer overhead and indices
				/// overhead), for "overloading" @newSparseTensor.
				enum class MLIR_SPARSETENSOR_EXPORT OverheadType : uint32_t {
				aganea (Not Done): Hello @wrengr! I'm seeing a bunch of warnings like this when building with clang-cl 14.0.6 on Windows:
				C:/git/llvm-project/mlir/include\mlir/ExecutionEngine/SparseTensor/Enums.h(58,12): warning: 'dllexport' attribute only applies to functions, variables, classes, and Objective-C interfaces [-Wignored-attributes]
				enum class MLIR_SPARSETENSOR_EXPORT OverheadType : uint32_t {
				           ^
				C:/git/llvm-project/mlir/include\mlir/ExecutionEngine/SparseTensor/Enums.h(36,45): note: expanded from macro 'MLIR_SPARSETENSOR_EXPORT'
				#define MLIR_SPARSETENSOR_EXPORT __declspec(dllexport)
				                                            ^
				The `dllimport/dllexport` can be omitted here on Windows because an `enum` is a type not a symbol. For non-Windows, do you really need the `__attribute__((visibility("default")))` on `enum`s?
				wrengr (Author, Done):
				> For non-Windows, do you really need the `__attribute__((visibility("default")))` on `enum`s?
				Probably not! On Linux everything works fine when `MLIR_SPARSETENSOR_EXPORT` expands to nothing (both when used on functions and on enums). This is my first time working with DLLs on Windows, so I just defined the macro following the idiom used elsewhere in the MLIR codebase. For MSVC and MSYS2 I thought the dllexport/dllimport stuff needed to be applied to `enum class` definitions (not just `class`/`struct` definitions), but I may very well be wrong. I'm about to post a differential for trying to resolve some other Windows issues (https://reviews.llvm.org/D134933#3843372). Once that's up could I get you to try building it to make sure it works on your Windows+Clang setup?
				aganea (Not Done): I will certainly try that! Thanks!
				wrengr (Author, Done): I just uploaded D135502, so please comment there to let me know how it goes :)
				kIndex = 0,
				kU64 = 1,
				kU32 = 2,
				kU16 = 3,
				kU8 = 4
				};

				// This x-macro calls its argument on every overhead type which has
				// fixed-width. It excludes `index_type` because that type is often
				// handled specially (e.g., by translating it into the architecture-dependent
				// equivalent fixed-width overhead type).
				#define FOREVERY_FIXED_O(DO) \
				DO(64, uint64_t) \
				DO(32, uint32_t) \
				DO(16, uint16_t) \
				DO(8, uint8_t)

				// This x-macro calls its argument on every overhead type, including
				// `index_type`.
				#define FOREVERY_O(DO) \
				FOREVERY_FIXED_O(DO) \
				DO(0, index_type)

				// These are not just shorthands but indicate the particular
				// implementation used (e.g., as opposed to C99's `complex double`,
				// or MLIR's `ComplexType`).
				using complex64 = std::complex<double>;
				using complex32 = std::complex<float>;

				/// Encoding of the elemental type, for "overloading" @newSparseTensor.
				enum class MLIR_SPARSETENSOR_EXPORT PrimaryType : uint32_t {
				kF64 = 1,
				kF32 = 2,
				kF16 = 3,
				kBF16 = 4,
				kI64 = 5,
				kI32 = 6,
				kI16 = 7,
				kI8 = 8,
				kC64 = 9,
				kC32 = 10
				};

				// This x-macro includes all `V` types.
				#define FOREVERY_V(DO) \
				DO(F64, double) \
				DO(F32, float) \
				DO(F16, f16) \
				DO(BF16, bf16) \
				DO(I64, int64_t) \
				DO(I32, int32_t) \
				DO(I16, int16_t) \
				DO(I8, int8_t) \
				DO(C64, complex64) \
				DO(C32, complex32)

				constexpr MLIR_SPARSETENSOR_EXPORT bool
				isFloatingPrimaryType(PrimaryType valTy) {
				return PrimaryType::kF64 <= valTy && valTy <= PrimaryType::kBF16;
				}

				constexpr MLIR_SPARSETENSOR_EXPORT bool
				isIntegralPrimaryType(PrimaryType valTy) {
				return PrimaryType::kI64 <= valTy && valTy <= PrimaryType::kI8;
				}

				constexpr MLIR_SPARSETENSOR_EXPORT bool isRealPrimaryType(PrimaryType valTy) {
				return PrimaryType::kF64 <= valTy && valTy <= PrimaryType::kI8;
				}

				constexpr MLIR_SPARSETENSOR_EXPORT bool
				isComplexPrimaryType(PrimaryType valTy) {
				return PrimaryType::kC64 <= valTy && valTy <= PrimaryType::kC32;
				}

				/// The actions performed by @newSparseTensor.
				enum class MLIR_SPARSETENSOR_EXPORT Action : uint32_t {
				kEmpty = 0,
				kFromFile = 1,
				kFromCOO = 2,
				kSparseToSparse = 3,
				kEmptyCOO = 4,
				kToCOO = 5,
				kToIterator = 6,
				};

				/// This enum mimics `SparseTensorEncodingAttr::DimLevelType` for
				/// breaking dependency cycles. `SparseTensorEncodingAttr::DimLevelType`
				/// is the source of truth and this enum should be kept consistent with it.
				enum class MLIR_SPARSETENSOR_EXPORT DimLevelType : uint8_t {
				aartbik (Done): empty line
				kDense = 0,
				kCompressed = 1,
				kCompressedNu = 2,
				kCompressedNo = 3,
				kCompressedNuNo = 4,
				kSingleton = 5,
				kSingletonNu = 6,
				kSingletonNo = 7,
				kSingletonNuNo = 8,
				};

				} // namespace sparse_tensor
				} // namespace mlir

				#endif // MLIR_EXECUTIONENGINE_SPARSETENSOR_ENUMS_H
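
The x-macros in `Enums.h` (`FOREVERY_V`, `FOREVERY_O`, `FOREVERY_FIXED_O`) each take another macro `DO` and apply it once per supported type, so a single type list can drive several generated declarations. The sketch below illustrates the technique on a shortened list. `FOREVERY_DEMO_V`, `DemoType`, and `sizeOfDemoType` are hypothetical names chosen for this example and do not appear in the patch.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical, shortened type list in the style of `FOREVERY_V`.
#define FOREVERY_DEMO_V(DO)                                                    \
  DO(F64, double)                                                              \
  DO(F32, float)                                                               \
  DO(I8, int8_t)

// First expansion: one enumerator per entry of the list.
enum class DemoType : uint32_t {
#define DECL_ENUMERATOR(VNAME, V) k##VNAME,
  FOREVERY_DEMO_V(DECL_ENUMERATOR)
#undef DECL_ENUMERATOR
};

// Second expansion: one `case` per entry, mapping each tag to the size of
// the C++ type it stands for.
inline std::size_t sizeOfDemoType(DemoType ty) {
  switch (ty) {
#define CASE_SIZEOF(VNAME, V)                                                  \
  case DemoType::k##VNAME:                                                     \
    return sizeof(V);
    FOREVERY_DEMO_V(CASE_SIZEOF)
#undef CASE_SIZEOF
  }
  return 0; // Only reached for values outside the enumerator list.
}
```

Adding a type to the list updates both the enum and the `switch` at once, which is the property the patch relies on when it stamps out per-type virtual methods such as `getValues` and `lexInsert`.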

mlir/include/mlir/ExecutionEngine/SparseTensor/ErrorHandling.h

This file was added.

				//===- ErrorHandling.h - Helpers for errors ---------------------- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This header is not part of the public API. It is placed in the
				// includes directory only because that's required by the implementations
				// of template-classes.
				//
				// This file defines an extremely lightweight API for fatal errors (not
				// arising from assertions). The API does not attempt to be sophisticated
				// in any way, it's just the usual "I give up" style of error reporting.
				//
				// This file is part of the lightweight runtime support library for sparse
				// tensor manipulations. The functionality of the support library is meant
				// to simplify benchmarking, testing, and debugging MLIR code operating on
				// sparse tensors. However, the provided functionality is not part of
				// core MLIR itself.
				//
				//===----------------------------------------------------------------------===//

				#ifndef MLIR_EXECUTIONENGINE_SPARSETENSOR_ERRORHANDLING_H
				#define MLIR_EXECUTIONENGINE_SPARSETENSOR_ERRORHANDLING_H

				#include <cstdio>
				#include <cstdlib>

				// This macro helps minimize repetition of this idiom, as well as ensuring
				// we have some additional output indicating where the error is coming from.
				// (Since `fprintf` doesn't provide a stacktrace, this helps make it easier
				// to track down whether an error is coming from our code vs somewhere else
				// in MLIR.)
				pgavin (Done): Could this be something like MLIR_SPARSETENSOR_FATAL since this is included in public headers?
				wrengr (Author, Done): Yes of course :) I was hoping to get the functional version below to work before landing this CL, and just forgot to do the renaming when I ran afoul of the `[-Wformat-security]` warnings.
				#define MLIR_SPARSETENSOR_FATAL(...) \
				do { \
				fprintf(stderr, "SparseTensorUtils: " __VA_ARGS__); \
				exit(1); \
				} while (0)

				#endif // MLIR_EXECUTIONENGINE_SPARSETENSOR_ERRORHANDLING_H
				aartbik (Done): don't check in commented out code as a general rule
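
`MLIR_SPARSETENSOR_FATAL` prepends its `"SparseTensorUtils: "` prefix by string-literal concatenation, which means the first argument has to be a string literal. The sketch below shows the same concatenation trick in a form that can be inspected without terminating the process: it formats into a buffer instead of printing to `stderr` and calling `exit`. `DEMO_FORMAT_ERROR` and `demoRankError` are hypothetical names used only for this illustration.

```cpp
#include <cstdio>
#include <string>

// Hypothetical variant of the macro, for illustration only: it formats into
// a caller-provided buffer instead of printing to stderr and exiting.
// As in the patch, the prefix is joined to the format string by literal
// concatenation, so the first variadic argument must be a string literal.
#define DEMO_FORMAT_ERROR(BUF, SIZE, ...)                                      \
  std::snprintf((BUF), (SIZE), "SparseTensorUtils: " __VA_ARGS__)

// Builds the message a rank-mismatch report would carry.
inline std::string demoRankError(unsigned long long got,
                                 unsigned long long want) {
  char buf[128];
  DEMO_FORMAT_ERROR(buf, sizeof(buf), "Rank mismatch: %llu vs %llu\n", got,
                    want);
  return std::string(buf);
}
```

Because the format string stays a literal, the compiler can still check it against the arguments. A function-based replacement that accepts the format as a runtime `const char *` gives up that property, which is plausibly related to the `-Wformat-security` warnings mentioned in the review, though the thread does not spell out the exact cause.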

mlir/include/mlir/ExecutionEngine/SparseTensor/File.h

This file was added.

				//===- File.h - Parsing sparse tensors from files ---------------- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file implements parsing and printing of files in one of the
				// following external formats:
				//
				// (1) Matrix Market Exchange (MME): *.mtx
				// https://math.nist.gov/MatrixMarket/formats.html
				//
				// (2) Formidable Repository of Open Sparse Tensors and Tools (FROSTT): *.tns
				// http://frostt.io/tensors/file-formats.html
				//
				// This file is part of the lightweight runtime support library for sparse
				// tensor manipulations. The functionality of the support library is meant
				// to simplify benchmarking, testing, and debugging MLIR code operating on
				// sparse tensors. However, the provided functionality is not part of
				// core MLIR itself.
				//
				//===----------------------------------------------------------------------===//

				#ifndef MLIR_EXECUTIONENGINE_SPARSETENSOR_FILE_H
				#define MLIR_EXECUTIONENGINE_SPARSETENSOR_FILE_H

				#include "mlir/ExecutionEngine/SparseTensor/Storage.h"

				#include <fstream>

				namespace mlir {
				namespace sparse_tensor {

				/// This class abstracts over the information stored in file headers,
				/// as well as providing the buffers and methods for parsing those headers.
				class SparseTensorFile final {
				public:
				enum class ValueKind {
				kInvalid = 0,
				kPattern = 1,
				kReal = 2,
				kInteger = 3,
				kComplex = 4,
				kUndefined = 5
				};

				explicit SparseTensorFile(char *filename) : filename(filename) {
				assert(filename && "Received nullptr for filename");
				}

				// Disallows copying, to avoid duplicating the `file` pointer.
				SparseTensorFile(const SparseTensorFile &) = delete;
				SparseTensorFile &operator=(const SparseTensorFile &) = delete;

				// This dtor tries to avoid leaking the `file`. (Though it's better
				// to call `closeFile` explicitly when possible, since there are
				// circumstances where dtors are not called reliably.)
				~SparseTensorFile() { closeFile(); }

				/// Opens the file for reading.
				void openFile();

				/// Closes the file.
				void closeFile();

				/// Attempts to read a line from the file.
				char *readLine();

				/// Reads and parses the file's header.
				void readHeader();

				ValueKind getValueKind() const { return valueKind_; }

				bool isValid() const { return valueKind_ != ValueKind::kInvalid; }

				/// Gets the MME "pattern" property setting. Is only valid after
				/// parsing the header.
				bool isPattern() const {
				assert(isValid() && "Attempt to isPattern() before readHeader()");
				return valueKind_ == ValueKind::kPattern;
				}

				/// Gets the MME "symmetric" property setting. Is only valid after
				/// parsing the header.
				bool isSymmetric() const {
				assert(isValid() && "Attempt to isSymmetric() before readHeader()");
				return isSymmetric_;
				}

				/// Gets the rank of the tensor. Is only valid after parsing the header.
				uint64_t getRank() const {
				assert(isValid() && "Attempt to getRank() before readHeader()");
				return idata[0];
				}

				/// Gets the number of non-zeros. Is only valid after parsing the header.
				uint64_t getNNZ() const {
				assert(isValid() && "Attempt to getNNZ() before readHeader()");
				return idata[1];
				}

				/// Gets the dimension-sizes array. The pointer itself is always
				/// valid; however, the values stored therein are only valid after
				/// parsing the header.
				const uint64_t *getDimSizes() const { return idata + 2; }

				/// Safely gets the size of the given dimension. Is only valid
				/// after parsing the header.
				uint64_t getDimSize(uint64_t d) const {
				assert(d < getRank());
				return idata[2 + d];
				}

				/// Asserts the shape subsumes the actual dimension sizes. Is only
				/// valid after parsing the header.
				void assertMatchesShape(uint64_t rank, const uint64_t *shape) const;

				private:
				void readMMEHeader();
				void readExtFROSTTHeader();

				static constexpr int kColWidth = 1025;
				const char *filename;
				FILE *file = nullptr;
				ValueKind valueKind_ = ValueKind::kInvalid;
				bool isSymmetric_ = false;
				uint64_t idata[512];
				char line[kColWidth];
				};

				namespace detail {

				// Adds a value to a tensor in coordinate scheme. If is_symmetric_value is true,
				// also adds the value to its symmetric location.
				template <typename T, typename V>
				inline void addValue(T *coo, V value, const std::vector<uint64_t> indices,
				bool is_symmetric_value) {
				// TODO: <https://github.com/llvm/llvm-project/issues/54179>
				coo->add(indices, value);
				// We currently chose to deal with symmetric matrices by fully constructing
				// them. In the future, we may want to make symmetry implicit for storage
				// reasons.
				if (is_symmetric_value)
				coo->add({indices[1], indices[0]}, value);
				}

				// Reads an element of a complex type for the current indices in coordinate
				// scheme.
				template <typename V>
				inline void readCOOValue(SparseTensorCOO<std::complex<V>> *coo,
				const std::vector<uint64_t> indices, char **linePtr,
				bool is_pattern, bool add_symmetric_value) {
				// Read two values to make a complex. The external formats always store
				// numerical values with the type double, but we cast these values to the
				// sparse tensor object type. For a pattern tensor, we arbitrarily pick the
				// value 1 for all entries.
				V re = is_pattern ? 1.0 : strtod(*linePtr, linePtr);
				V im = is_pattern ? 1.0 : strtod(*linePtr, linePtr);
				std::complex<V> value = {re, im};
				addValue(coo, value, indices, add_symmetric_value);
				}

				// Reads an element of a non-complex type for the current indices in coordinate
				// scheme.
				template <typename V,
				typename std::enable_if<
				!std::is_same<std::complex<float>, V>::value &&
				!std::is_same<std::complex<double>, V>::value>::type * = nullptr>
				inline void readCOOValue(SparseTensorCOO<V> *coo,
				const std::vector<uint64_t> indices, char **linePtr,
				bool is_pattern, bool is_symmetric_value) {
				// The external formats always store these numerical values with the type
				// double, but we cast these values to the sparse tensor object type.
				// For a pattern tensor, we arbitrarily pick the value 1 for all entries.
				double value = is_pattern ? 1.0 : strtod(*linePtr, linePtr);
				addValue(coo, value, indices, is_symmetric_value);
				}

				} // namespace detail

				/// Reads a sparse tensor with the given filename into a memory-resident
				/// sparse tensor in coordinate scheme.
				template <typename V>
				inline SparseTensorCOO<V> *
				openSparseTensorCOO(char *filename, uint64_t rank, const uint64_t *shape,
				const uint64_t *perm, PrimaryType valTp) {
				SparseTensorFile stfile(filename);
				stfile.openFile();
				stfile.readHeader();
				// Check tensor element type against the value type in the input file.
				SparseTensorFile::ValueKind valueKind = stfile.getValueKind();
				bool tensorIsInteger =
				(valTp >= PrimaryType::kI64 && valTp <= PrimaryType::kI8);
				bool tensorIsReal = (valTp >= PrimaryType::kF64 && valTp <= PrimaryType::kI8);
				if ((valueKind == SparseTensorFile::ValueKind::kReal && tensorIsInteger) ||
				(valueKind == SparseTensorFile::ValueKind::kComplex && tensorIsReal)) {
				MLIR_SPARSETENSOR_FATAL(
				"Tensor element type %d not compatible with values in file %s\n",
				static_cast<int>(valTp), filename);
				}
				stfile.assertMatchesShape(rank, shape);
				// Prepare sparse tensor object with per-dimension sizes
				// and the number of nonzeros as initial capacity.
				uint64_t nnz = stfile.getNNZ();
				auto *coo = SparseTensorCOO<V>::newSparseTensorCOO(rank, stfile.getDimSizes(),
				perm, nnz);
				// Read all nonzero elements.
				std::vector<uint64_t> indices(rank);
				for (uint64_t k = 0; k < nnz; k++) {
				char *linePtr = stfile.readLine();
				for (uint64_t r = 0; r < rank; r++) {
				uint64_t idx = strtoul(linePtr, &linePtr, 10);
				// Add 0-based index.
				indices[perm[r]] = idx - 1;
				}
				detail::readCOOValue(coo, indices, &linePtr, stfile.isPattern(),
				stfile.isSymmetric() && indices[0] != indices[1]);
				}
				// Close the file and return tensor.
				stfile.closeFile();
				return coo;
				}

				/// Writes the sparse tensor to `dest` in extended FROSTT format.
				template <typename V>
				inline void outSparseTensor(void *tensor, void *dest, bool sort) {
				assert(tensor && dest);
				auto coo = static_cast<SparseTensorCOO<V> *>(tensor);
				if (sort)
				coo->sort();
				char *filename = static_cast<char *>(dest);
				auto &dimSizes = coo->getDimSizes();
				auto &elements = coo->getElements();
				uint64_t rank = coo->getRank();
				uint64_t nnz = elements.size();
				std::fstream file;
				file.open(filename, std::ios_base::out | std::ios_base::trunc);
				assert(file.is_open());
				file << "; extended FROSTT format\n" << rank << " " << nnz << std::endl;
				for (uint64_t r = 0; r < rank - 1; r++)
				file << dimSizes[r] << " ";
				file << dimSizes[rank - 1] << std::endl;
				for (uint64_t i = 0; i < nnz; i++) {
				auto &idx = elements[i].indices;
				for (uint64_t r = 0; r < rank; r++)
				file << (idx[r] + 1) << " ";
				file << elements[i].value << std::endl;
				}
				file.flush();
				file.close();
				assert(file.good());
				}

				} // namespace sparse_tensor
				} // namespace mlir

				#endif // MLIR_EXECUTIONENGINE_SPARSETENSOR_FILE_H
				aartbik (Done): extra line before closing #endif
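
The element-reading loop in `openSparseTensorCOO` parses each data line as `rank` 1-based indices followed by a value, shifts the indices to 0-based, and (for symmetric matrices) stores off-diagonal entries a second time with the indices swapped. The following is a minimal sketch of that per-line logic in isolation. `DemoEntry`, `parseCOOLine`, and `expandSymmetric` are hypothetical names, the sketch ignores the `perm` permutation, and it uses `std::strtoull` where the patch uses `strtoul`.

```cpp
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for one COO element (not the patch's `Element<V>`).
struct DemoEntry {
  std::vector<uint64_t> indices;
  double value;
};

// Parses one data line of the form "i_1 ... i_rank value", where the
// indices in the file are 1-based. For a pattern file no value is present
// and 1.0 is used instead. No permutation is applied in this sketch.
inline DemoEntry parseCOOLine(const char *line, uint64_t rank, bool isPattern) {
  DemoEntry entry{std::vector<uint64_t>(rank), 1.0};
  char *cursor = const_cast<char *>(line); // strtoull/strtod take `char **`.
  for (uint64_t r = 0; r < rank; r++)
    entry.indices[r] = std::strtoull(cursor, &cursor, 10) - 1; // to 0-based
  if (!isPattern)
    entry.value = std::strtod(cursor, &cursor);
  return entry;
}

// Expands a symmetric rank-2 entry, mirroring what `openSparseTensorCOO`
// and `addValue` do together: off-diagonal entries are stored twice, once
// at (i, j) and once at (j, i); diagonal entries are stored once.
inline std::vector<DemoEntry> expandSymmetric(const DemoEntry &e) {
  std::vector<DemoEntry> out{e};
  if (e.indices[0] != e.indices[1])
    out.push_back({{e.indices[1], e.indices[0]}, e.value});
  return out;
}
```

Fully materializing the mirrored entry trades memory for simplicity, which matches the comment in `addValue` noting that symmetry may be made implicit in the future.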

mlir/include/mlir/ExecutionEngine/SparseTensor/Storage.h

This file was added.

				//===- Storage.h - TACO-flavored sparse tensor representation ---- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file is part of the lightweight runtime support library for sparse
				// tensor manipulations. The functionality of the support library is meant
				// to simplify benchmarking, testing, and debugging MLIR code operating on
				// sparse tensors. However, the provided functionality is not part of
				// core MLIR itself.
				//
				// This file contains definitions for the following classes:
				//
				// * `SparseTensorStorageBase`
				// * `SparseTensorStorage<P, I, V>`
				// * `SparseTensorEnumeratorBase<V>`
				// * `SparseTensorEnumerator<P, I, V>`
				// * `SparseTensorNNZ`
				//
				// Ideally we would split the storage classes and enumerator classes
				// into separate files, to improve legibility. But alas: because these
				// are template-classes, they must therefore provide definitions in the
				// header; and those definitions cause circular dependencies that make it
				// impossible to split the file up along the desired lines. (We could
				// split the base classes from the derived classes, but that doesn't
				// particularly help improve legibility.)
				//
				//===----------------------------------------------------------------------===//

				#ifndef MLIR_EXECUTIONENGINE_SPARSETENSOR_STORAGE_H
				#define MLIR_EXECUTIONENGINE_SPARSETENSOR_STORAGE_H

				#include "mlir/ExecutionEngine/SparseTensor/COO.h"
				#include "mlir/ExecutionEngine/SparseTensor/CheckedMul.h"
				#include "mlir/ExecutionEngine/SparseTensor/Enums.h"
				#include "mlir/ExecutionEngine/SparseTensor/ErrorHandling.h"

				namespace mlir {
				namespace sparse_tensor {

				namespace detail {

				// TODO: try to unify this with `SparseTensorFile::assertMatchesShape`
				// which is used by `openSparseTensorCOO`. It's easy enough to resolve
				// the `std::vector` vs pointer mismatch for `dimSizes`; but it's trickier
				// to resolve the presence/absence of `perm` (without introducing extra
				// overhead), so perhaps the code duplication is unavoidable.
				//
				/// Asserts that the `dimSizes` (in target-order) under the `perm` (mapping
				/// semantic-order to target-order) are a refinement of the desired `shape`
				/// (in semantic-order).
				///
				/// Precondition: `perm` and `shape` must be valid for `rank`.
				inline void assertPermutedSizesMatchShape(const std::vector<uint64_t> &dimSizes,
				uint64_t rank, const uint64_t *perm,
				const uint64_t *shape) {
				assert(perm && shape);
				assert(rank == dimSizes.size() && "Rank mismatch");
				for (uint64_t r = 0; r < rank; r++)
				assert((shape[r] == 0 || shape[r] == dimSizes[perm[r]]) &&
				"Dimension size mismatch");
				}

				} // namespace detail
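
To make the index bookkeeping in `assertPermutedSizesMatchShape` concrete, here is a minimal sketch of the same comparison written as a predicate. `permutedSizesMatchShape` is a hypothetical name, and the sketch takes `perm` and `shape` as vectors rather than raw pointers so that their lengths can be checked as well.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical bool-returning variant of `assertPermutedSizesMatchShape`.
// `dimSizes` is in target-order, `shape` is in semantic-order, and `perm[r]`
// is the target position of semantic dimension `r`. A `shape` entry of 0 is
// treated as a wildcard, since the patch's check skips such entries.
inline bool permutedSizesMatchShape(const std::vector<uint64_t> &dimSizes,
                                    const std::vector<uint64_t> &perm,
                                    const std::vector<uint64_t> &shape) {
  const uint64_t rank = dimSizes.size();
  if (perm.size() != rank || shape.size() != rank)
    return false;
  for (uint64_t r = 0; r < rank; r++)
    if (shape[r] != 0 && shape[r] != dimSizes[perm[r]])
      return false;
  return true;
}
```

For example, with `dimSizes = {4, 3}` and `perm = {1, 0}`, the semantic shape `{3, 4}` matches, whereas the same sizes under the identity permutation `{0, 1}` do not.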

				// This forward decl is sufficient to split `SparseTensorStorageBase` into
				// its own header, but isn't sufficient for `SparseTensorStorage` to join it.
				template <typename V>
				class SparseTensorEnumeratorBase;

				/// Abstract base class for `SparseTensorStorage<P,I,V>`. This class
				/// takes responsibility for all the `<P,I,V>`-independent aspects
				/// of the tensor (e.g., shape, sparsity, permutation). In addition,
				/// we use function overloading to implement "partial" method
				/// specialization, which the C-API relies on to catch type errors
				/// arising from our use of opaque pointers.
				class SparseTensorStorageBase {
				public:
				/// Constructs a new storage object. The `perm` maps the tensor's
				/// semantic-ordering of dimensions to this object's storage-order.
				/// The `dimSizes` and `sparsity` arrays are already in storage-order.
				pgavin (Not Done): Can these be ArrayRefs also?
				///
				/// Precondition: `perm` and `sparsity` must be valid for `dimSizes.size()`.
				SparseTensorStorageBase(const std::vector<uint64_t> &dimSizes,
				const uint64_t *perm, const DimLevelType *sparsity);

				virtual ~SparseTensorStorageBase() = default;

				/// Get the rank of the tensor.
				uint64_t getRank() const { return dimSizes.size(); }

				/// Getter for the dimension-sizes array, in storage-order.
				const std::vector<uint64_t> &getDimSizes() const { return dimSizes; }

				/// Safely lookup the size of the given (storage-order) dimension.
				uint64_t getDimSize(uint64_t d) const {
				assert(d < getRank());
				return dimSizes[d];
				}

				/// Getter for the "reverse" permutation, which maps this object's
				/// storage-order to the tensor's semantic-order.
				const std::vector<uint64_t> &getRev() const { return rev; }

				/// Getter for the dimension-types array, in storage-order.
				const std::vector<DimLevelType> &getDimTypes() const { return dimTypes; }

				/// Safely check if the (storage-order) dimension uses dense storage.
				bool isDenseDim(uint64_t d) const {
				assert(d < getRank());
				return dimTypes[d] == DimLevelType::kDense;
				}

				/// Safely check if the (storage-order) dimension uses compressed storage.
				bool isCompressedDim(uint64_t d) const {
				assert(d < getRank());
				switch (dimTypes[d]) {
				case DimLevelType::kCompressed:
				case DimLevelType::kCompressedNu:
				case DimLevelType::kCompressedNo:
				case DimLevelType::kCompressedNuNo:
				return true;
				default:
				return false;
				}
				}

				/// Safely check if the (storage-order) dimension uses singleton storage.
				bool isSingletonDim(uint64_t d) const {
				assert(d < getRank());
				switch (dimTypes[d]) {
				case DimLevelType::kSingleton:
				case DimLevelType::kSingletonNu:
				case DimLevelType::kSingletonNo:
				case DimLevelType::kSingletonNuNo:
				return true;
				default:
				return false;
				}
				}

				/// Safely check if the (storage-order) dimension is ordered.
				bool isOrderedDim(uint64_t d) const {
				assert(d < getRank());
				aartbik (Done): thanks for pre-merging this!
				switch (dimTypes[d]) {
				case DimLevelType::kCompressedNo:
				case DimLevelType::kCompressedNuNo:
				case DimLevelType::kSingletonNo:
				case DimLevelType::kSingletonNuNo:
				return false;
				default:
				return true;
				}
				}

				/// Safely check if the (storage-order) dimension is unique.
				bool isUniqueDim(uint64_t d) const {
				assert(d < getRank());
				switch (dimTypes[d]) {
				case DimLevelType::kCompressedNu:
				case DimLevelType::kCompressedNuNo:
				case DimLevelType::kSingletonNu:
				case DimLevelType::kSingletonNuNo:
				return false;
				default:
				return true;
				}
				}

				/// Allocate a new enumerator.
				#define DECL_NEWENUMERATOR(VNAME, V) \
				virtual void newEnumerator(SparseTensorEnumeratorBase<V> **, uint64_t, \
				const uint64_t *) const;
				FOREVERY_V(DECL_NEWENUMERATOR)
				#undef DECL_NEWENUMERATOR

				/// Overhead storage.
				#define DECL_GETPOINTERS(PNAME, P) \
				virtual void getPointers(std::vector<P> **, uint64_t);
				FOREVERY_FIXED_O(DECL_GETPOINTERS)
				#undef DECL_GETPOINTERS
				#define DECL_GETINDICES(INAME, I) \
				virtual void getIndices(std::vector<I> **, uint64_t);
				FOREVERY_FIXED_O(DECL_GETINDICES)
				#undef DECL_GETINDICES

				/// Primary storage.
				#define DECL_GETVALUES(VNAME, V) virtual void getValues(std::vector<V> **);
				FOREVERY_V(DECL_GETVALUES)
				#undef DECL_GETVALUES

				/// Element-wise insertion in lexicographic index order.
				#define DECL_LEXINSERT(VNAME, V) virtual void lexInsert(const uint64_t *, V);
				FOREVERY_V(DECL_LEXINSERT)
				#undef DECL_LEXINSERT

				/// Expanded insertion.
				#define DECL_EXPINSERT(VNAME, V) \
				virtual void expInsert(uint64_t *, V *, bool *, uint64_t *, uint64_t);
				FOREVERY_V(DECL_EXPINSERT)
				#undef DECL_EXPINSERT

				/// Finishes insertion.
				virtual void endInsert() = 0;

				protected:
				// Since this class is virtual, we must disallow public copying in
				// order to avoid "slicing". Since this class has data members,
				// that means making copying protected.
				// <https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rc-copy-virtual>
				SparseTensorStorageBase(const SparseTensorStorageBase &) = default;
				// Copy-assignment would be implicitly deleted (because `dimSizes`
				// is const), so we explicitly delete it for clarity.
				SparseTensorStorageBase &operator=(const SparseTensorStorageBase &) = delete;

				private:
				const std::vector<uint64_t> dimSizes;
				std::vector<uint64_t> rev;
				const std::vector<DimLevelType> dimTypes;
				};

				// This forward decl is necessary for defining `SparseTensorStorage`,
				// but isn't sufficient for splitting it off.
				template <typename P, typename I, typename V>
				class SparseTensorEnumerator;

				/// A memory-resident sparse tensor using a storage scheme based on
				/// per-dimension sparse/dense annotations. This data structure provides a
				/// bufferized form of a sparse tensor type. In contrast to generating setup
				/// methods for each differently annotated sparse tensor, this method provides
				/// a convenient "one-size-fits-all" solution that simply takes an input tensor
				/// and annotations to implement all required setup in a general manner.
				template <typename P, typename I, typename V>
				class SparseTensorStorage final : public SparseTensorStorageBase {
				/// Private constructor to share code between the other constructors.
				/// Beware that the object is not necessarily guaranteed to be in a
				/// valid state after this constructor alone; e.g., `isCompressedDim(d)`
				/// doesn't entail `!(pointers[d].empty())`.
				///
				/// Precondition: `perm` and `sparsity` must be valid for `dimSizes.size()`.
				SparseTensorStorage(const std::vector<uint64_t> &dimSizes,
				const uint64_t *perm, const DimLevelType *sparsity)
				: SparseTensorStorageBase(dimSizes, perm, sparsity), pointers(getRank()),
				indices(getRank()), idx(getRank()) {}

				public:
				/// Constructs a sparse tensor storage scheme with the given dimensions,
				/// permutation, and per-dimension dense/sparse annotations, using
				/// the coordinate scheme tensor for the initial contents if provided.
				///
				/// Precondition: `perm` and `sparsity` must be valid for `dimSizes.size()`.
				SparseTensorStorage(const std::vector<uint64_t> &dimSizes,
				const uint64_t *perm, const DimLevelType *sparsity,
				SparseTensorCOO<V> *coo)
				: SparseTensorStorage(dimSizes, perm, sparsity) {
				// Provide hints on capacity of pointers and indices.
				// TODO: needs much fine-tuning based on actual sparsity; currently
				// we reserve pointer/index space based on all previous dense
				// dimensions, which works well up to first sparse dim; but
				// we should really use nnz and dense/sparse distribution.
				bool allDense = true;
				uint64_t sz = 1;
				for (uint64_t r = 0, rank = getRank(); r < rank; r++) {
				if (isCompressedDim(r)) {
				// TODO: Take a parameter between 1 and `dimSizes[r]`, and multiply
				// `sz` by that before reserving. (For now we just use 1.)
				pointers[r].reserve(sz + 1);
				pointers[r].push_back(0);
				indices[r].reserve(sz);
				sz = 1;
				allDense = false;
				} else if (isSingletonDim(r)) {
				indices[r].reserve(sz);
				sz = 1;
				allDense = false;
				} else { // Dense dimension.
				assert(isDenseDim(r));
				sz = detail::checkedMul(sz, getDimSizes()[r]);
				}
				}
				// Then assign contents from coordinate scheme tensor if provided.
				if (coo) {
				// Ensure both preconditions of `fromCOO`.
				assert(coo->getDimSizes() == getDimSizes() && "Tensor size mismatch");
				coo->sort();
				// Now actually insert the `elements`.
				const std::vector<Element<V>> &elements = coo->getElements();
				uint64_t nnz = elements.size();
				values.reserve(nnz);
				fromCOO(elements, 0, nnz, 0);
				} else if (allDense) {
				values.resize(sz, 0);
				}
				}

				/// Constructs a sparse tensor storage scheme with the given dimensions,
				/// permutation, and per-dimension dense/sparse annotations, using
				/// the given sparse tensor for the initial contents.
				///
				/// Preconditions:
				/// * `perm` and `sparsity` must be valid for `dimSizes.size()`.
				/// * The `tensor` must have the same value type `V`.
				SparseTensorStorage(const std::vector<uint64_t> &dimSizes,
				const uint64_t *perm, const DimLevelType *sparsity,
				const SparseTensorStorageBase &tensor);

				~SparseTensorStorage() final = default;

				/// Partially specialize these getter methods based on template types.
				void getPointers(std::vector<P> **out, uint64_t d) final {
				assert(d < getRank());
				*out = &pointers[d];
				}
				void getIndices(std::vector<I> **out, uint64_t d) final {
				assert(d < getRank());
				*out = &indices[d];
				}
				void getValues(std::vector<V> **out) final { *out = &values; }

				/// Partially specialize lexicographical insertions based on template types.
				void lexInsert(const uint64_t *cursor, V val) final {
				// First, wrap up pending insertion path.
				uint64_t diff = 0;
				uint64_t top = 0;
				if (!values.empty()) {
				diff = lexDiff(cursor);
				endPath(diff + 1);
				top = idx[diff] + 1;
				}
				// Then continue with insertion path.
				insPath(cursor, diff, top, val);
				}

				/// Partially specialize expanded insertions based on template types.
				/// Note that this method resets the values/filled-switch array back
				/// to all-zero/false while only iterating over the nonzero elements.
				void expInsert(uint64_t *cursor, V *values, bool *filled, uint64_t *added,
				uint64_t count) final {
				if (count == 0)
				return;
				// Sort.
				std::sort(added, added + count);
				// Restore insertion path for first insert.
				const uint64_t lastDim = getRank() - 1;
				uint64_t index = added[0];
				cursor[lastDim] = index;
				lexInsert(cursor, values[index]);
				assert(filled[index]);
				values[index] = 0;
				filled[index] = false;
				// Subsequent insertions are quick.
				for (uint64_t i = 1; i < count; i++) {
				assert(index < added[i] && "non-lexicographic insertion");
				index = added[i];
				cursor[lastDim] = index;
				insPath(cursor, lastDim, added[i - 1] + 1, values[index]);
				assert(filled[index]);
				values[index] = 0;
				filled[index] = false;
				}
				}

				/// Finalizes lexicographic insertions.
				void endInsert() final {
				if (values.empty())
				finalizeSegment(0);
				else
				endPath(0);
				}

				void newEnumerator(SparseTensorEnumeratorBase<V> **out, uint64_t rank,
				const uint64_t *perm) const final {
				*out = new SparseTensorEnumerator<P, I, V>(*this, rank, perm);
				}

				/// Returns this sparse tensor storage scheme as a new memory-resident
				/// sparse tensor in coordinate scheme with the given dimension order.
				///
				/// Precondition: `perm` must be valid for `getRank()`.
				SparseTensorCOO<V> *toCOO(const uint64_t *perm) const {
				SparseTensorEnumeratorBase<V> *enumerator;
				newEnumerator(&enumerator, getRank(), perm);
				SparseTensorCOO<V> *coo =
				new SparseTensorCOO<V>(enumerator->permutedSizes(), values.size());
				enumerator->forallElements([&coo](const std::vector<uint64_t> &ind, V val) {
				coo->add(ind, val);
				});
				// TODO: This assertion assumes there are no stored zeros,
				// or if there are then that we don't filter them out.
				// Cf., <https://github.com/llvm/llvm-project/issues/54179>
				assert(coo->getElements().size() == values.size());
				delete enumerator;
				return coo;
				}

				/// Factory method. Constructs a sparse tensor storage scheme with the given
				/// dimensions, permutation, and per-dimension dense/sparse annotations,
				/// using the coordinate scheme tensor for the initial contents if provided.
				/// In the latter case, the coordinate scheme must respect the same
				/// permutation as is desired for the new sparse tensor storage.
				///
				/// Precondition: `shape`, `perm`, and `sparsity` must be valid for `rank`.
				static SparseTensorStorage<P, I, V> *
				newSparseTensor(uint64_t rank, const uint64_t *shape, const uint64_t *perm,
				const DimLevelType *sparsity, SparseTensorCOO<V> *coo) {
				SparseTensorStorage<P, I, V> *n = nullptr;
				if (coo) {
				const auto &coosz = coo->getDimSizes();
				detail::assertPermutedSizesMatchShape(coosz, rank, perm, shape);
				n = new SparseTensorStorage<P, I, V>(coosz, perm, sparsity, coo);
				} else {
				std::vector<uint64_t> permsz(rank);
				for (uint64_t r = 0; r < rank; r++) {
				assert(shape[r] > 0 && "Dimension size zero has trivial storage");
				permsz[perm[r]] = shape[r];
				}
				// We pass the null `coo` to ensure we select the intended constructor.
				n = new SparseTensorStorage<P, I, V>(permsz, perm, sparsity, coo);
				}
				return n;
				}

				/// Factory method. Constructs a sparse tensor storage scheme with
				/// the given dimensions, permutation, and per-dimension dense/sparse
				/// annotations, using the sparse tensor for the initial contents.
				///
				/// Preconditions:
				/// * `shape`, `perm`, and `sparsity` must be valid for `rank`.
				/// * The `tensor` must have the same value type `V`.
				static SparseTensorStorage<P, I, V> *
				newSparseTensor(uint64_t rank, const uint64_t *shape, const uint64_t *perm,
				const DimLevelType *sparsity,
				const SparseTensorStorageBase *source) {
				assert(source && "Got nullptr for source");
				SparseTensorEnumeratorBase<V> *enumerator;
				source->newEnumerator(&enumerator, rank, perm);
				const auto &permsz = enumerator->permutedSizes();
				detail::assertPermutedSizesMatchShape(permsz, rank, perm, shape);
				auto *tensor =
				new SparseTensorStorage<P, I, V>(permsz, perm, sparsity, *source);
				delete enumerator;
				return tensor;
				}

				private:
				/// Appends an arbitrary new position to `pointers[d]`. This method
				/// checks that `pos` is representable in the `P` type; however, it
				/// does not check that `pos` is semantically valid (i.e., larger than
				/// the previous position and smaller than `indices[d].capacity()`).
				void appendPointer(uint64_t d, uint64_t pos, uint64_t count = 1) {
				assert(isCompressedDim(d));
				assert(pos <= std::numeric_limits<P>::max() &&
				"Pointer value is too large for the P-type");
				pointers[d].insert(pointers[d].end(), count, static_cast<P>(pos));
				}

				/// Appends index `i` to dimension `d`, in the semantically general
				/// sense. For non-dense dimensions, that means appending to the
				/// `indices[d]` array, checking that `i` is representable in the `I`
				/// type; however, we do not verify other semantic requirements (e.g.,
				/// that `i` is in bounds for `dimSizes[d]`, and not previously occurring
				/// in the same segment). For dense dimensions, this method instead
				/// appends the appropriate number of zeros to the `values` array,
				/// where `full` is the number of "entries" already written to `values`
				/// for this segment (aka one after the highest index previously appended).
				void appendIndex(uint64_t d, uint64_t full, uint64_t i) {
				if (isCompressedDim(d) || isSingletonDim(d)) {
				assert(i <= std::numeric_limits<I>::max() &&
				"Index value is too large for the I-type");
				indices[d].push_back(static_cast<I>(i));
				} else { // Dense dimension.
				assert(isDenseDim(d));
				assert(i >= full && "Index was already filled");
				if (i == full)
				return; // Short-circuit, since it'll be a nop.
				if (d + 1 == getRank())
				values.insert(values.end(), i - full, 0);
				else
				finalizeSegment(d + 1, 0, i - full);
				}
				}

				/// Writes the given coordinate to `indices[d][pos]`. This method
				/// checks that `i` is representable in the `I` type; however, it
				/// does not check that `i` is semantically valid (i.e., in bounds
				/// for `dimSizes[d]` and not elsewhere occurring in the same segment).
				void writeIndex(uint64_t d, uint64_t pos, uint64_t i) {
				assert(isCompressedDim(d));
				// Subscript assignment to `std::vector` requires that the `pos`-th
				// entry has been initialized; thus we must be sure to check `size()`
				// here, instead of `capacity()` as would be ideal.
				assert(pos < indices[d].size() && "Index position is out of bounds");
				assert(i <= std::numeric_limits<I>::max() &&
				"Index value is too large for the I-type");
				indices[d][pos] = static_cast<I>(i);
				}

				/// Computes the assembled-size associated with the `d`-th dimension,
				/// given the assembled-size associated with the `(d-1)`-th dimension.
				/// "Assembled-sizes" correspond to the (nominal) sizes of overhead
				/// storage, as opposed to "dimension-sizes" which are the cardinality
				/// of coordinates for that dimension.
				///
				/// Precondition: the `pointers[d]` array must be fully initialized
				/// before calling this method.
				uint64_t assembledSize(uint64_t parentSz, uint64_t d) const {
				if (isCompressedDim(d))
				return pointers[d][parentSz];
				// else if dense:
				return parentSz * getDimSizes()[d];
				}

				/// Initializes sparse tensor storage scheme from a memory-resident sparse
				/// tensor in coordinate scheme. This method prepares the pointers and
				/// indices arrays under the given per-dimension dense/sparse annotations.
				///
				/// Preconditions:
				/// (1) the `elements` must be lexicographically sorted.
				/// (2) the indices of every element are valid for `dimSizes` (equal rank
				/// and pointwise less-than).
				void fromCOO(const std::vector<Element<V>> &elements, uint64_t lo,
				uint64_t hi, uint64_t d) {
				uint64_t rank = getRank();
				assert(d <= rank && hi <= elements.size());
				// Once dimensions are exhausted, insert the numerical values.
				if (d == rank) {
				assert(lo < hi);
				values.push_back(elements[lo].value);
				return;
				}
				// Visit all elements in this interval.
				uint64_t full = 0;
				while (lo < hi) { // If `hi` is unchanged, then `lo < elements.size()`.
				// Find segment in interval with same index elements in this dimension.
				uint64_t i = elements[lo].indices[d];
				uint64_t seg = lo + 1;
				bool merge = isUniqueDim(d);
				while (merge && seg < hi && elements[seg].indices[d] == i)
				seg++;
				// Handle segment in interval for sparse or dense dimension.
				appendIndex(d, full, i);
				full = i + 1;
				fromCOO(elements, lo, seg, d + 1);
				// And move on to next segment in interval.
				lo = seg;
				}
				// Finalize the sparse pointer structure at this dimension.
				finalizeSegment(d, full);
				}

				/// Finalize the sparse pointer structure at this dimension.
				void finalizeSegment(uint64_t d, uint64_t full = 0, uint64_t count = 1) {
				if (count == 0)
				return; // Short-circuit, since it'll be a nop.
				if (isCompressedDim(d)) {
				appendPointer(d, indices[d].size(), count);
				} else if (isSingletonDim(d)) {
				return;
				} else { // Dense dimension.
				assert(isDenseDim(d));
				const uint64_t sz = getDimSizes()[d];
				assert(sz >= full && "Segment is overfull");
				count = detail::checkedMul(count, sz - full);
				// For dense storage we must enumerate all the remaining coordinates
				// in this dimension (i.e., coordinates after the last non-zero
				// element), and either fill in their zero values or else recurse
				// to finalize some deeper dimension.
				if (d + 1 == getRank())
				values.insert(values.end(), count, 0);
				else
				finalizeSegment(d + 1, 0, count);
				}
				}

				/// Wraps up a single insertion path, inner to outer.
				void endPath(uint64_t diff) {
				uint64_t rank = getRank();
				assert(diff <= rank);
				for (uint64_t i = 0; i < rank - diff; i++) {
				const uint64_t d = rank - i - 1;
				finalizeSegment(d, idx[d] + 1);
				}
				}

				/// Continues a single insertion path, outer to inner.
				void insPath(const uint64_t *cursor, uint64_t diff, uint64_t top, V val) {
				uint64_t rank = getRank();
				assert(diff < rank);
				for (uint64_t d = diff; d < rank; d++) {
				uint64_t i = cursor[d];
				appendIndex(d, top, i);
				top = 0;
				idx[d] = i;
				}
				values.push_back(val);
				}

				/// Finds the lexicographic differing dimension.
				uint64_t lexDiff(const uint64_t *cursor) const {
				for (uint64_t r = 0, rank = getRank(); r < rank; r++)
				if (cursor[r] > idx[r])
				return r;
				else
				assert(cursor[r] == idx[r] && "non-lexicographic insertion");
				assert(0 && "duplication insertion");
				return -1u;
				}

				// Allow `SparseTensorEnumerator` to access the data-members (to avoid
				// the cost of virtual-function dispatch in inner loops), without
				// making them public to other client code.
				friend class SparseTensorEnumerator<P, I, V>;

				std::vector<std::vector<P>> pointers;
				std::vector<std::vector<I>> indices;
				std::vector<V> values;
				std::vector<uint64_t> idx; // index cursor for lexicographic insertion.
				};
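Note on the class above: the way `fromCOO`, `appendIndex`, and `finalizeSegment` cooperate may be easier to follow on a rank-2 special case. The following is a minimal sketch, not code from this differential. It assumes dimension 0 is dense and dimension 1 is compressed (i.e., CSR), and the names `Entry`, `CsrSketch`, and `buildCsr` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical rank-2 stand-in for the `Element<V>` type used in the diff.
struct Entry {
  uint64_t row, col;
  double value;
};

// Hypothetical rank-2 analogue of the storage scheme: dimension 0 is
// dense, dimension 1 is compressed.
struct CsrSketch {
  std::vector<uint64_t> pointers; // numRows + 1 segment boundaries
  std::vector<uint64_t> indices;  // column coordinate of each stored value
  std::vector<double> values;     // one value per stored coordinate
};

// Preconditions (mirroring those documented for `fromCOO`): `sorted` is
// in lexicographic order and every coordinate is in bounds.
CsrSketch buildCsr(uint64_t numRows, const std::vector<Entry> &sorted) {
  CsrSketch out;
  out.pointers.reserve(numRows + 1);
  out.pointers.push_back(0); // the first segment always starts at 0
  std::size_t next = 0;
  for (uint64_t r = 0; r < numRows; ++r) {
    // Append every element whose row coordinate equals `r`.
    while (next < sorted.size() && sorted[next].row == r) {
      out.indices.push_back(sorted[next].col);
      out.values.push_back(sorted[next].value);
      ++next;
    }
    // Close the segment for row `r` by recording the current size of
    // `indices`, which is what finalizing a compressed dimension does.
    out.pointers.push_back(out.indices.size());
  }
  assert(next == sorted.size() && "unsorted or out-of-bounds input");
  return out;
}
```

Empty rows still receive a pointer entry, so `pointers` always has `numRows + 1` elements regardless of how many nonzeros are stored.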

				/// A (higher-order) function object for enumerating the elements of some
				/// `SparseTensorStorage` under a permutation. That is, the `forallElements`
				/// method encapsulates the loop-nest for enumerating the elements of
				/// the source tensor (in whatever order is best for the source tensor),
				/// and applies a permutation to the coordinates/indices before handing
				/// each element to the callback. A single enumerator object can be
				/// freely reused for several calls to `forallElements`, just so long
				/// as each call is sequential with respect to one another.
				///
				/// N.B., this class stores a reference to the `SparseTensorStorageBase`
				/// passed to the constructor; thus, objects of this class must not
				/// outlive the sparse tensor they depend on.
				///
				/// Design Note: The reason we define this class instead of simply using
				/// `SparseTensorEnumerator<P,I,V>` is because we need to hide/generalize
				/// the `<P,I>` template parameters from MLIR client code (to simplify the
				/// type parameters used for direct sparse-to-sparse conversion). And the
				/// reason we define the `SparseTensorEnumerator<P,I,V>` subclasses rather
				/// than simply using this class, is to avoid the cost of virtual-method
				/// dispatch within the loop-nest.
				template <typename V>
				class SparseTensorEnumeratorBase {
				public:
				/// Constructs an enumerator with the given permutation for mapping
				/// the semantic-ordering of dimensions to the desired target-ordering.
				///
				/// Preconditions:
				/// * the `tensor` must have the same `V` value type.
				/// * `perm` must be valid for `rank`.
				SparseTensorEnumeratorBase(const SparseTensorStorageBase &tensor,
				uint64_t rank, const uint64_t *perm)
				: src(tensor), permsz(src.getRev().size()), reord(getRank()),
				cursor(getRank()) {
				assert(perm && "Received nullptr for permutation");
				assert(rank == getRank() && "Permutation rank mismatch");
				const auto &rev = src.getRev(); // source-order -> semantic-order
				const auto &dimSizes = src.getDimSizes(); // in source storage-order
				for (uint64_t s = 0; s < rank; s++) { // `s` source storage-order
				uint64_t t = perm[rev[s]]; // `t` target-order
				reord[s] = t;
				permsz[t] = dimSizes[s];
				}
				}

				virtual ~SparseTensorEnumeratorBase() = default;

				// We disallow copying to help avoid leaking the `src` reference.
				// (In addition to avoiding the problem of slicing.)
				SparseTensorEnumeratorBase(const SparseTensorEnumeratorBase &) = delete;
				SparseTensorEnumeratorBase &
				operator=(const SparseTensorEnumeratorBase &) = delete;

				/// Returns the source/target tensor's rank. (The source-rank and
				/// target-rank are always equal since we only support permutations.
				/// Though once we add support for other dimension mappings, this
				/// method will have to be split in two.)
				uint64_t getRank() const { return permsz.size(); }

				/// Returns the target tensor's dimension sizes.
				const std::vector<uint64_t> &permutedSizes() const { return permsz; }

				/// Enumerates all elements of the source tensor, permutes their
				/// indices, and passes the permuted element to the callback.
				/// The callback must not store the cursor reference directly,
				/// since this function reuses the storage. Instead, the callback
				/// must copy it if they want to keep it.
				virtual void forallElements(ElementConsumer<V> yield) = 0;

				protected:
				const SparseTensorStorageBase &src;
				std::vector<uint64_t> permsz; // in target order.
				std::vector<uint64_t> reord; // source storage-order -> target order.
				std::vector<uint64_t> cursor; // in target order.
				};

				template <typename P, typename I, typename V>
				class SparseTensorEnumerator final : public SparseTensorEnumeratorBase<V> {
				using Base = SparseTensorEnumeratorBase<V>;

				public:
				/// Constructs an enumerator with the given permutation for mapping
				/// the semantic-ordering of dimensions to the desired target-ordering.
				///
				/// Precondition: `perm` must be valid for `rank`.
				SparseTensorEnumerator(const SparseTensorStorage<P, I, V> &tensor,
				uint64_t rank, const uint64_t *perm)
				: Base(tensor, rank, perm) {}

				~SparseTensorEnumerator() final = default;

				void forallElements(ElementConsumer<V> yield) final {
				forallElements(yield, 0, 0);
				}

				private:
				/// The recursive component of the public `forallElements`.
				void forallElements(ElementConsumer<V> yield, uint64_t parentPos,
				uint64_t d) {
				// Recover the `<P,I,V>` type parameters of `src`.
				const auto &src =
				static_cast<const SparseTensorStorage<P, I, V> &>(this->src);
				if (d == Base::getRank()) {
				assert(parentPos < src.values.size() &&
				"Value position is out of bounds");
				// TODO: <https://github.com/llvm/llvm-project/issues/54179>
				yield(this->cursor, src.values[parentPos]);
				} else if (src.isCompressedDim(d)) {
				// Look up the bounds of the `d`-level segment determined by the
				// `d-1`-level position `parentPos`.
				const std::vector<P> &pointersD = src.pointers[d];
				assert(parentPos + 1 < pointersD.size() &&
				"Parent pointer position is out of bounds");
				const uint64_t pstart = static_cast<uint64_t>(pointersD[parentPos]);
				const uint64_t pstop = static_cast<uint64_t>(pointersD[parentPos + 1]);
				// Loop-invariant code for looking up the `d`-level coordinates/indices.
				const std::vector<I> &indicesD = src.indices[d];
				assert(pstop <= indicesD.size() && "Index position is out of bounds");
				uint64_t &cursorReordD = this->cursor[this->reord[d]];
				for (uint64_t pos = pstart; pos < pstop; pos++) {
				cursorReordD = static_cast<uint64_t>(indicesD[pos]);
				forallElements(yield, pos, d + 1);
				}
				} else if (src.isSingletonDim(d)) {
				MLIR_SPARSETENSOR_FATAL("unsupported dimension level type");
				} else { // Dense dimension.
				assert(src.isDenseDim(d));
				const uint64_t sz = src.getDimSizes()[d];
				const uint64_t pstart = parentPos * sz;
				uint64_t &cursorReordD = this->cursor[this->reord[d]];
				for (uint64_t i = 0; i < sz; i++) {
				cursorReordD = i;
				forallElements(yield, pstart + i, d + 1);
				}
				}
				}
				};
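Note on the enumerator above: the key idea is that the loop nest walks the source in its own storage order while writing each coordinate into the cursor slot chosen by `reord`. The following is a minimal rank-2 sketch under the assumption of a dense/compressed (CSR) source. `CsrView`, `Consumer`, and `forallElementsSketch` are hypothetical names, not part of this differential.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical read-only view of a rank-2 dense/compressed tensor.
struct CsrView {
  uint64_t numRows;
  std::vector<uint64_t> pointers, indices;
  std::vector<double> values;
};

// Hypothetical analogue of `ElementConsumer<V>` with `V = double`.
using Consumer = std::function<void(const std::vector<uint64_t> &, double)>;

// `reord[d]` is the target position of source dimension `d`.
void forallElementsSketch(const CsrView &src,
                          const std::vector<uint64_t> &reord,
                          const Consumer &yield) {
  // The cursor is reused across callbacks, so a callback that wants to
  // keep the coordinates has to copy them.
  std::vector<uint64_t> cursor(2);
  for (uint64_t row = 0; row < src.numRows; ++row) {
    cursor[reord[0]] = row; // dense dimension: visit every coordinate
    const uint64_t pstart = src.pointers[row];
    const uint64_t pstop = src.pointers[row + 1];
    for (uint64_t pos = pstart; pos < pstop; ++pos) {
      cursor[reord[1]] = src.indices[pos]; // compressed dimension
      yield(cursor, src.values[pos]);
    }
  }
}
```

Passing `reord = {1, 0}` hands the callback transposed coordinates without changing the order in which the source is traversed.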

				/// Statistics regarding the number of nonzero subtensors in
				/// a source tensor, for direct sparse=>sparse conversion a la
				/// <https://arxiv.org/abs/2001.02609>.
				///
				/// N.B., this class stores references to the parameters passed to
				/// the constructor; thus, objects of this class must not outlive
				/// those parameters.
				class SparseTensorNNZ final {
				public:
				/// Allocate the statistics structure for the desired sizes and
				/// sparsity (in the target tensor's storage-order). This constructor
				/// does not actually populate the statistics, however; for that see
				/// `initialize`.
				///
				/// Precondition: `dimSizes` must not contain zeros.
				SparseTensorNNZ(const std::vector<uint64_t> &dimSizes,
				const std::vector<DimLevelType> &sparsity);

				// We disallow copying to help avoid leaking the stored references.
				SparseTensorNNZ(const SparseTensorNNZ &) = delete;
				SparseTensorNNZ &operator=(const SparseTensorNNZ &) = delete;

				/// Returns the rank of the target tensor.
				uint64_t getRank() const { return dimSizes.size(); }

				/// Enumerate the source tensor to fill in the statistics. The
				/// enumerator should already incorporate the permutation (from
				/// semantic-order to the target storage-order).
				template <typename V>
				void initialize(SparseTensorEnumeratorBase<V> &enumerator) {
				assert(enumerator.getRank() == getRank() && "Tensor rank mismatch");
				assert(enumerator.permutedSizes() == dimSizes && "Tensor size mismatch");
				enumerator.forallElements(
				[this](const std::vector<uint64_t> &ind, V) { add(ind); });
				}

				/// The type of callback functions which receive an nnz-statistic.
				using NNZConsumer = const std::function<void(uint64_t)> &;

				/// Lexicographically enumerates all indices for dimensions strictly
				/// less than `stopDim`, and passes their nnz statistic to the callback.
				/// Since our use-case only requires the statistic not the coordinates
				/// themselves, we do not bother to construct those coordinates.
				void forallIndices(uint64_t stopDim, NNZConsumer yield) const;

				private:
				/// Adds a new element (i.e., increment its statistics). We use
				/// a method rather than inlining into the lambda in `initialize`,
				/// to avoid spurious templating over `V`. And this method is private
				/// to avoid needing to re-assert validity of `ind` (which is guaranteed
				/// by `forallElements`).
				void add(const std::vector<uint64_t> &ind);

				/// Recursive component of the public `forallIndices`.
				void forallIndices(NNZConsumer yield, uint64_t stopDim, uint64_t parentPos,
				uint64_t d) const;

				// All of these are in the target storage-order.
				const std::vector<uint64_t> &dimSizes;
				const std::vector<DimLevelType> &dimTypes;
				std::vector<std::vector<uint64_t>> nnz;
				};

				template <typename P, typename I, typename V>
				SparseTensorStorage<P, I, V>::SparseTensorStorage(
				const std::vector<uint64_t> &dimSizes, const uint64_t *perm,
				const DimLevelType *sparsity, const SparseTensorStorageBase &tensor)
				: SparseTensorStorage(dimSizes, perm, sparsity) {
				SparseTensorEnumeratorBase<V> *enumerator;
				tensor.newEnumerator(&enumerator, getRank(), perm);
				{
				// Initialize the statistics structure.
				SparseTensorNNZ nnz(getDimSizes(), getDimTypes());
				nnz.initialize(*enumerator);
				// Initialize "pointers" overhead (and allocate "indices", "values").
				uint64_t parentSz = 1; // assembled-size (not dimension-size) of `r-1`.
				for (uint64_t rank = getRank(), r = 0; r < rank; r++) {
				if (isCompressedDim(r)) {
				pointers[r].reserve(parentSz + 1);
				pointers[r].push_back(0);
				uint64_t currentPos = 0;
				nnz.forallIndices(r, [this, &currentPos, r](uint64_t n) {
				currentPos += n;
				appendPointer(r, currentPos);
				});
				assert(pointers[r].size() == parentSz + 1 &&
				"Final pointers size doesn't match allocated size");
				// That assertion entails `assembledSize(parentSz, r)`
				// is now in a valid state. That is, `pointers[r][parentSz]`
				// equals the present value of `currentPos`, which is the
				// correct assembled-size for `indices[r]`.
				}
				// Update assembled-size for the next iteration.
				parentSz = assembledSize(parentSz, r);
				// Ideally we need only `indices[r].reserve(parentSz)`, however
				// the `std::vector` implementation forces us to initialize it too.
				// That is, in the yieldPos loop we need random-access assignment
				// to `indices[r]`; however, `std::vector`'s subscript-assignment
				// only allows assigning to already-initialized positions.
				if (isCompressedDim(r))
				indices[r].resize(parentSz, 0);
				}
				values.resize(parentSz, 0); // Both allocate and zero-initialize.
				}
				// The yieldPos loop
				enumerator->forallElements([this](const std::vector<uint64_t> &ind, V val) {
				uint64_t parentSz = 1, parentPos = 0;
				for (uint64_t rank = getRank(), r = 0; r < rank; r++) {
				if (isCompressedDim(r)) {
				// If `parentPos == parentSz` then it's valid as an array-lookup;
				// however, it's semantically invalid here since that entry
				// does not represent a segment of `indices[r]`. Moreover, that
				// entry must be immutable for `assembledSize` to remain valid.
				assert(parentPos < parentSz && "Pointers position is out of bounds");
				const uint64_t currentPos = pointers[r][parentPos];
				// This increment won't overflow the `P` type, since it can't
				// exceed the original value of `pointers[r][parentPos+1]`
				// which was already verified to be within bounds for `P`
				// when it was written to the array.
				pointers[r][parentPos]++;
				writeIndex(r, currentPos, ind[r]);
				parentPos = currentPos;
				} else if (isSingletonDim(r)) {
				aartbik (inline comment, marked Done): empty line
				// the new parentPos equals the old parentPos.
				} else { // Dense dimension.
				assert(isDenseDim(r));
				parentPos = parentPos * getDimSizes()[r] + ind[r];
				}
				parentSz = assembledSize(parentSz, r);
				}
				assert(parentPos < values.size() && "Value position is out of bounds");
				values[parentPos] = val;
				});
				// No longer need the enumerator, so we'll delete it ASAP.
				delete enumerator;
				// The finalizeYieldPos loop
				for (uint64_t parentSz = 1, rank = getRank(), r = 0; r < rank; r++) {
				if (isCompressedDim(r)) {
				assert(parentSz == pointers[r].size() - 1 &&
				"Actual pointers size doesn't match the expected size");
				// Can't check all of them, but at least we can check the last one.
				assert(pointers[r][parentSz - 1] == pointers[r][parentSz] &&
				"Pointers got corrupted");
				// TODO: optimize this by using `memmove` or similar.
				for (uint64_t n = 0; n < parentSz; n++) {
				const uint64_t parentPos = parentSz - n;
				pointers[r][parentPos] = pointers[r][parentPos - 1];
				}
				pointers[r][0] = 0;
				}
				parentSz = assembledSize(parentSz, r);
				}
				}
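Note on the constructor above: its three phases (count nonzeros and build `pointers`, the "yieldPos" loop, and the "finalizeYieldPos" loop) can be illustrated on a rank-2 dense/compressed target. This is a minimal sketch under that assumption, with the hypothetical names `Coord`, `CsrResult`, and `convertSketch`. It is not the implementation in this differential.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical element type: a rank-2 coordinate plus a value.
struct Coord {
  uint64_t row, col;
  double value;
};

// Hypothetical rank-2 dense/compressed result.
struct CsrResult {
  std::vector<uint64_t> pointers, indices;
  std::vector<double> values;
};

// Builds the target without sorting `entries` by row first.
CsrResult convertSketch(uint64_t numRows, const std::vector<Coord> &entries) {
  CsrResult out;
  // Phase 1: gather per-row nonzero counts (the role of `SparseTensorNNZ`)
  // and turn them into cumulative segment boundaries.
  std::vector<uint64_t> nnz(numRows, 0);
  for (const Coord &e : entries) {
    assert(e.row < numRows && "row coordinate is out of bounds");
    ++nnz[e.row];
  }
  out.pointers.reserve(numRows + 1);
  out.pointers.push_back(0);
  uint64_t currentPos = 0;
  for (uint64_t n : nnz) {
    currentPos += n;
    out.pointers.push_back(currentPos);
  }
  // `std::vector` only allows subscript assignment to initialized
  // positions, hence `resize` rather than `reserve`.
  out.indices.resize(currentPos, 0);
  out.values.resize(currentPos, 0);
  // Phase 2 ("yieldPos"): `pointers[row]` doubles as the next free slot
  // of that row's segment and is bumped after each write.
  for (const Coord &e : entries) {
    const uint64_t pos = out.pointers[e.row]++;
    out.indices[pos] = e.col;
    out.values[pos] = e.value;
  }
  // Phase 3 ("finalizeYieldPos"): each entry now holds the end of its own
  // segment, so shifting right by one restores the segment starts.
  for (uint64_t p = numRows; p > 0; --p)
    out.pointers[p] = out.pointers[p - 1];
  out.pointers[0] = 0;
  return out;
}
```

In this sketch the elements within one row keep their arrival order, so the column coordinates of a segment are sorted only if the input visits each row's elements in column order.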

				} // namespace sparse_tensor
				} // namespace mlir

				#endif // MLIR_EXECUTIONENGINE_SPARSETENSOR_STORAGE_H

mlir/include/mlir/ExecutionEngine/SparseTensorUtils.h

//===- SparseTensorUtils.h - Enums shared with the runtime ------- C++ --===//		//===- SparseTensorUtils.h - SparseTensor runtime support lib ---- C++ --===//
		aartbik (inline comment, marked Done): This reference to enums seems very outdated now
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This header file provides the enums and functions which comprise the		// This header file provides the enums and functions which comprise the
// public API of `ExecutionEngine/SparseUtils.cpp`.		// public API of the `ExecutionEngine/SparseTensorUtils.cpp` runtime
		// support library for the SparseTensor dialect.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#ifndef MLIR_EXECUTIONENGINE_SPARSETENSORUTILS_H_		#ifndef MLIR_EXECUTIONENGINE_SPARSETENSORUTILS_H
#define MLIR_EXECUTIONENGINE_SPARSETENSORUTILS_H_		#define MLIR_EXECUTIONENGINE_SPARSETENSORUTILS_H

#include "mlir/ExecutionEngine/CRunnerUtils.h"		#include "mlir/ExecutionEngine/CRunnerUtils.h"
#include "mlir/ExecutionEngine/Float16bits.h"		#include "mlir/ExecutionEngine/SparseTensor/Enums.h"

#include <cinttypes>		#include <cinttypes>
#include <complex>		#include <complex>
#include <vector>		#include <vector>

extern "C" {		using namespace mlir::sparse_tensor;
		pgavin (inline comment, marked Not Done): I think even a namespace, extern "C" causes these symbols to share the global namespace. It might be better to remove the namespace and prefix each symbol to avoid collisions?

//===----------------------------------------------------------------------===//
//
// Typedefs and enums. These are required to be public so that they
// can be shared with `Transforms/SparseTensorConversion.cpp`, since
// they define the arguments to the public functions declared later on.
//
// This section also defines x-macros <https://en.wikipedia.org/wiki/X_Macro>
// so that we can generate variations of the public functions for each
// supported primary- and/or overhead-type.
//
//===----------------------------------------------------------------------===//

/// This type is used in the public API at all places where MLIR expects		extern "C" {
/// values with the built-in type "index". For now, we simply assume that
/// type is 64-bit, but targets with different "index" bit widths should
/// link with an alternatively built runtime support library.
// TODO: support such targets?
using index_type = uint64_t;

/// Encoding of overhead types (both pointer overhead and indices
/// overhead), for "overloading" @newSparseTensor.
enum class OverheadType : uint32_t {
kIndex = 0,
kU64 = 1,
kU32 = 2,
kU16 = 3,
kU8 = 4
};

// This x-macro calls its argument on every overhead type which has
// fixed-width. It excludes `index_type` because that type is often
// handled specially (e.g., by translating it into the architecture-dependent
// equivalent fixed-width overhead type).
#define FOREVERY_FIXED_O(DO) \
DO(64, uint64_t) \
DO(32, uint32_t) \
DO(16, uint16_t) \
DO(8, uint8_t)

// This x-macro calls its argument on every overhead type, including
// `index_type`.
#define FOREVERY_O(DO) \
FOREVERY_FIXED_O(DO) \
DO(0, index_type)
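Note on the x-macros above: for readers unfamiliar with the pattern, the macro takes another macro `DO` as its argument and expands it once per (suffix, type) pair. The following is a minimal sketch of one possible use. The `_SKETCH` macro name, `DECL_MAX_OVERHEAD`, and the generated `maxOverhead*` functions are hypothetical and do not appear in this differential.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical copy of the fixed-width overhead x-macro, renamed so it
// cannot clash with the real `FOREVERY_FIXED_O`.
#define FOREVERY_FIXED_O_SKETCH(DO)                                            \
  DO(64, uint64_t)                                                             \
  DO(32, uint32_t)                                                             \
  DO(16, uint16_t)                                                             \
  DO(8, uint8_t)

// Generates one function per overhead type: `maxOverhead64`,
// `maxOverhead32`, `maxOverhead16`, and `maxOverhead8`. Each returns
// the largest position or index representable in that type.
#define DECL_MAX_OVERHEAD(ONAME, O)                                            \
  inline uint64_t maxOverhead##ONAME() {                                       \
    return static_cast<uint64_t>(std::numeric_limits<O>::max());              \
  }
FOREVERY_FIXED_O_SKETCH(DECL_MAX_OVERHEAD)
#undef DECL_MAX_OVERHEAD
```

Adding a new overhead type then requires touching only the x-macro list, and every family of functions generated from it picks up the new variant.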

// These are not just shorthands but indicate the particular
// implementation used (e.g., as opposed to C99's `complex double`,
// or MLIR's `ComplexType`).
using complex64 = std::complex<double>;
using complex32 = std::complex<float>;

/// Encoding of the elemental type, for "overloading" @newSparseTensor.
enum class PrimaryType : uint32_t {
kF64 = 1,
kF32 = 2,
kF16 = 3,
kBF16 = 4,
kI64 = 5,
kI32 = 6,
kI16 = 7,
kI8 = 8,
kC64 = 9,
kC32 = 10
};

// This x-macro includes all `V` types.
#define FOREVERY_V(DO) \
DO(F64, double) \
DO(F32, float) \
DO(F16, f16) \
DO(BF16, bf16) \
DO(I64, int64_t) \
DO(I32, int32_t) \
DO(I16, int16_t) \
DO(I8, int8_t) \
DO(C64, complex64) \
DO(C32, complex32)

/// The actions performed by @newSparseTensor.
enum class Action : uint32_t {
kEmpty = 0,
kFromFile = 1,
kFromCOO = 2,
kSparseToSparse = 3,
kEmptyCOO = 4,
kToCOO = 5,
kToIterator = 6,
};

/// This enum mimics `SparseTensorEncodingAttr::DimLevelType` for
/// breaking dependency cycles. `SparseTensorEncodingAttr::DimLevelType`
/// is the source of truth and this enum should be kept consistent with it.
enum class DimLevelType : uint8_t {
kDense = 0,
kCompressed = 1,
kCompressedNu = 2,
kCompressedNo = 3,
kCompressedNuNo = 4,
kSingleton = 5,
kSingletonNu = 6,
kSingletonNo = 7,
kSingletonNuNo = 8,
};

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// Public functions which operate on MLIR buffers (memrefs) to interact		// Public functions which operate on MLIR buffers (memrefs) to interact
// with sparse tensors (which are only visible as opaque pointers externally).		// with sparse tensors (which are only visible as opaque pointers externally).
// Because these functions deal with memrefs, they should only be used		// Because these functions deal with memrefs, they should only be used
// by MLIR compiler-generated code (or code similarly guaranteed to remain		// by MLIR compiler-generated code (or code similarly guaranteed to remain
// in sync with MLIR; e.g., internal development tools like benchmarks).		// in sync with MLIR; e.g., internal development tools like benchmarks).
[101 lines not shown]
/// Coordinate-scheme method to write to file in extended FROSTT format.		/// Coordinate-scheme method to write to file in extended FROSTT format.
#define DECL_OUTSPARSETENSOR(VNAME, V) \		#define DECL_OUTSPARSETENSOR(VNAME, V) \
MLIR_CRUNNERUTILS_EXPORT void outSparseTensor##VNAME(void *coo, void *dest, \		MLIR_CRUNNERUTILS_EXPORT void outSparseTensor##VNAME(void *coo, void *dest, \
bool sort);		bool sort);
FOREVERY_V(DECL_OUTSPARSETENSOR)		FOREVERY_V(DECL_OUTSPARSETENSOR)
#undef DECL_OUTSPARSETENSOR		#undef DECL_OUTSPARSETENSOR

/// Releases the memory for the tensor-storage object.		/// Releases the memory for the tensor-storage object.
void delSparseTensor(void *tensor);		MLIR_CRUNNERUTILS_EXPORT void delSparseTensor(void *tensor);

/// Releases the memory for the coordinate-scheme object.		/// Releases the memory for the coordinate-scheme object.
#define DECL_DELCOO(VNAME, V) \		#define DECL_DELCOO(VNAME, V) \
MLIR_CRUNNERUTILS_EXPORT void delSparseTensorCOO##VNAME(void *coo);		MLIR_CRUNNERUTILS_EXPORT void delSparseTensorCOO##VNAME(void *coo);
FOREVERY_V(DECL_DELCOO)		FOREVERY_V(DECL_DELCOO)
#undef DECL_DELCOO		#undef DECL_DELCOO

/// Helper function to read a sparse tensor filename from the environment,		/// Helper function to read a sparse tensor filename from the environment,
[48 lines not shown]	#define DECL_CONVERTFROMMLIRSPARSETENSOR(VNAME, V) \
MLIR_CRUNNERUTILS_EXPORT void convertFromMLIRSparseTensor##VNAME( \		MLIR_CRUNNERUTILS_EXPORT void convertFromMLIRSparseTensor##VNAME( \
void *tensor, uint64_t *pRank, uint64_t *pNse, uint64_t **pShape, \		void *tensor, uint64_t *pRank, uint64_t *pNse, uint64_t **pShape, \
V **pValues, uint64_t **pIndices);		V **pValues, uint64_t **pIndices);
FOREVERY_V(DECL_CONVERTFROMMLIRSPARSETENSOR)		FOREVERY_V(DECL_CONVERTFROMMLIRSPARSETENSOR)
#undef DECL_CONVERTFROMMLIRSPARSETENSOR		#undef DECL_CONVERTFROMMLIRSPARSETENSOR

} // extern "C"		} // extern "C"

#endif // MLIR_EXECUTIONENGINE_SPARSETENSORUTILS_H_		#endif // MLIR_EXECUTIONENGINE_SPARSETENSORUTILS_H
		aartbik (inline comment, done): empty line before closing #endif
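
The x-macro pattern that `FOREVERY_V` and `FOREVERY_O` rely on in this header can be illustrated outside MLIR with a minimal, self-contained sketch. The names `FOREVERY_DEMO_V`, `sizeOfDemo*`, and `demoNames` are hypothetical and are not part of the runtime library; the list is also restricted to standard C++ types so that it compiles without `f16`/`bf16`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical x-macro listing (name suffix, C++ type) pairs, written in the
// same style as FOREVERY_V but limited to standard types.
#define FOREVERY_DEMO_V(DO)                                                    \
  DO(F64, double)                                                              \
  DO(F32, float)                                                               \
  DO(I64, int64_t)                                                             \
  DO(I8, int8_t)

// Generate one function per entry: sizeOfDemoF64(), sizeOfDemoF32(), ...
#define DECL_SIZEOF(VNAME, V)                                                  \
  inline uint64_t sizeOfDemo##VNAME() { return sizeof(V); }
FOREVERY_DEMO_V(DECL_SIZEOF)
#undef DECL_SIZEOF

// Reuse the same list to collect the generated suffixes as strings.
inline std::vector<std::string> demoNames() {
  std::vector<std::string> names;
#define PUSH_NAME(VNAME, V) names.push_back(#VNAME);
  FOREVERY_DEMO_V(PUSH_NAME)
#undef PUSH_NAME
  return names;
}
```

Adding a new type to the list extends every generated declaration at once, which is presumably why the header keeps the lists next to the enums they mirror.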

mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.h

	[10 lines not shown]
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#ifndef MLIR_DIALECT_SPARSETENSOR_TRANSFORMS_CODEGENUTILS_H_			#ifndef MLIR_DIALECT_SPARSETENSOR_TRANSFORMS_CODEGENUTILS_H_
	#define MLIR_DIALECT_SPARSETENSOR_TRANSFORMS_CODEGENUTILS_H_			#define MLIR_DIALECT_SPARSETENSOR_TRANSFORMS_CODEGENUTILS_H_

	#include "mlir/Dialect/Arith/IR/Arith.h"			#include "mlir/Dialect/Arith/IR/Arith.h"
	#include "mlir/Dialect/Complex/IR/Complex.h"			#include "mlir/Dialect/Complex/IR/Complex.h"
	#include "mlir/Dialect/SparseTensor/IR/SparseTensor.h"			#include "mlir/Dialect/SparseTensor/IR/SparseTensor.h"
	#include "mlir/ExecutionEngine/SparseTensorUtils.h"			#include "mlir/ExecutionEngine/SparseTensor/Enums.h"
	#include "mlir/IR/Builders.h"			#include "mlir/IR/Builders.h"

	namespace mlir {			namespace mlir {
	class Location;			class Location;
	class Type;			class Type;
	class Value;			class Value;

	namespace sparse_tensor {			namespace sparse_tensor {
	[172 lines not shown]

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp

	[22 lines not shown]
	#include "mlir/Dialect/Func/IR/FuncOps.h"			#include "mlir/Dialect/Func/IR/FuncOps.h"
	#include "mlir/Dialect/LLVMIR/LLVMDialect.h"			#include "mlir/Dialect/LLVMIR/LLVMDialect.h"
	#include "mlir/Dialect/Linalg/Utils/Utils.h"			#include "mlir/Dialect/Linalg/Utils/Utils.h"
	#include "mlir/Dialect/MemRef/IR/MemRef.h"			#include "mlir/Dialect/MemRef/IR/MemRef.h"
	#include "mlir/Dialect/SCF/IR/SCF.h"			#include "mlir/Dialect/SCF/IR/SCF.h"
	#include "mlir/Dialect/SparseTensor/IR/SparseTensor.h"			#include "mlir/Dialect/SparseTensor/IR/SparseTensor.h"
	#include "mlir/Dialect/SparseTensor/Transforms/Passes.h"			#include "mlir/Dialect/SparseTensor/Transforms/Passes.h"
	#include "mlir/Dialect/Tensor/IR/Tensor.h"			#include "mlir/Dialect/Tensor/IR/Tensor.h"
	#include "mlir/ExecutionEngine/SparseTensorUtils.h"			#include "mlir/ExecutionEngine/SparseTensor/Enums.h"
	#include "mlir/Transforms/DialectConversion.h"			#include "mlir/Transforms/DialectConversion.h"

	using namespace mlir;			using namespace mlir;
	using namespace mlir::sparse_tensor;			using namespace mlir::sparse_tensor;

	namespace {			namespace {

	/// Shorthand aliases for the `emitCInterface` argument to `getFunc()`,			/// Shorthand aliases for the `emitCInterface` argument to `getFunc()`,
	[1,476 lines not shown]

mlir/lib/ExecutionEngine/CMakeLists.txt

[112 lines not shown]	add_mlir_library(MLIRJitRunner
MLIRIR		MLIRIR
MLIRParser		MLIRParser
MLIRLLVMToLLVMIRTranslation		MLIRLLVMToLLVMIRTranslation
MLIRTargetLLVMIRExport		MLIRTargetLLVMIRExport
MLIRTransforms		MLIRTransforms
MLIRSupport		MLIRSupport
)		)

		add_mlir_library(mlir_float16_utils
		SHARED
		Float16bits.cpp

		EXCLUDE_FROM_LIBMLIR
		)
		set_property(TARGET mlir_float16_utils PROPERTY CXX_STANDARD 17)
		target_compile_definitions(mlir_float16_utils PRIVATE mlir_float16_utils_EXPORTS)

		add_subdirectory(SparseTensor)

add_mlir_library(mlir_c_runner_utils		add_mlir_library(mlir_c_runner_utils
SHARED		SHARED
CRunnerUtils.cpp		CRunnerUtils.cpp
Float16bits.cpp
SparseTensorUtils.cpp		SparseTensorUtils.cpp

EXCLUDE_FROM_LIBMLIR		EXCLUDE_FROM_LIBMLIR

		LINK_LIBS PUBLIC
		mlir_sparse_tensor_utils
)		)
set_property(TARGET mlir_c_runner_utils PROPERTY CXX_STANDARD 17)		set_property(TARGET mlir_c_runner_utils PROPERTY CXX_STANDARD 17)
target_compile_definitions(mlir_c_runner_utils PRIVATE mlir_c_runner_utils_EXPORTS)		target_compile_definitions(mlir_c_runner_utils PRIVATE mlir_c_runner_utils_EXPORTS)

add_mlir_library(mlir_runner_utils		add_mlir_library(mlir_runner_utils
SHARED		SHARED
RunnerUtils.cpp		RunnerUtils.cpp

[123 lines not shown]

mlir/lib/ExecutionEngine/Float16bits.cpp

	//===--- Float16bits.cpp - supports 2-byte floats ------------------------===//			//===--- Float16bits.cpp - supports 2-byte floats ------------------------===//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	//			//
	// This file implements f16 and bf16 to support the compilation and execution			// This file implements f16 and bf16 to support the compilation and execution
	// of programs using these types.			// of programs using these types.
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#include "mlir/ExecutionEngine/Float16bits.h"			#include "mlir/ExecutionEngine/Float16bits.h"

				#ifdef MLIR_FLOAT16_DEFINE_FUNCTIONS // We are building this library

	#include <cmath>			#include <cmath>
	#include <cstring>			#include <cstring>

	namespace {			namespace {

	// Union used to make the int/float aliasing explicit so we can access the raw			// Union used to make the int/float aliasing explicit so we can access the raw
	// bits.			// bits.
	union Float32Bits {			union Float32Bits {
	[160 lines not shown]

	// Provide a double->bfloat conversion routine in case the runtime doesn't have			// Provide a double->bfloat conversion routine in case the runtime doesn't have
	// one.			// one.
	extern "C" BF16ABIType ATTR_WEAK __truncdfbf2(double d) {			extern "C" BF16ABIType ATTR_WEAK __truncdfbf2(double d) {
	// This does a double rounding step, but it's precise enough for our use			// This does a double rounding step, but it's precise enough for our use
	// cases.			// cases.
	return __truncsfbf2(static_cast<float>(d));			return __truncsfbf2(static_cast<float>(d));
	}			}

				#endif // MLIR_FLOAT16_DEFINE_FUNCTIONS

mlir/lib/ExecutionEngine/SparseTensor/CMakeLists.txt

This file was added.

				# Unlike mlir_float16_utils, mlir_c_runner_utils, etc, we do not make
				# this a shared library: because doing so causes issues building on Windows.
				add_mlir_library(mlir_sparse_tensor_utils
				File.cpp
				NNZ.cpp
				Storage.cpp

				EXCLUDE_FROM_LIBMLIR

				LINK_LIBS PUBLIC
				mlir_float16_utils
				)
				set_property(TARGET mlir_sparse_tensor_utils PROPERTY CXX_STANDARD 17)
				target_compile_definitions(mlir_sparse_tensor_utils PRIVATE mlir_sparse_tensor_utils_EXPORTS)

				# To make sure we adhere to the style guide:
				# <https://llvm.org/docs/CodingStandards.html#provide-a-virtual-method-anchor-for-classes-in-headers>
				check_cxx_compiler_flag(-Wweak-vtables
				COMPILER_SUPPORTS_WARNING_WEAK_VTABLES)
				if(COMPILER_SUPPORTS_WARNING_WEAK_VTABLES)
				target_compile_options(mlir_sparse_tensor_utils PUBLIC
				"-Wweak-vtables")
				endif()

mlir/lib/ExecutionEngine/SparseTensor/File.cpp

This file was added.

				//===- File.cpp - Parsing sparse tensors from files -----------------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file implements parsing and printing of files in one of the
				// following external formats:
				//
				// (1) Matrix Market Exchange (MME): *.mtx
				// https://math.nist.gov/MatrixMarket/formats.html
				//
				// (2) Formidable Repository of Open Sparse Tensors and Tools (FROSTT): *.tns
				// http://frostt.io/tensors/file-formats.html
				//
				// This file is part of the lightweight runtime support library for sparse
				// tensor manipulations. The functionality of the support library is meant
				// to simplify benchmarking, testing, and debugging MLIR code operating on
				// sparse tensors. However, the provided functionality is not part of
				// core MLIR itself.
				//
				//===----------------------------------------------------------------------===//

				#include "mlir/ExecutionEngine/SparseTensor/File.h"

				#ifdef MLIR_SPARSETENSOR_DEFINE_FUNCTIONS // We are building this library

				#include <cctype>
				#include <cstring>

				pgavin (inline comment, done): Why not enclose the definitions below in the namespace?
				wrengr (author, inline comment, done): It's LLVM/MLIR style (https://llvm.org/docs/CodingStandards.html#use-namespace-qualifiers-to-implement…) and guards against future bugs.
				using namespace mlir::sparse_tensor;

				/// Opens the file for reading.
				void SparseTensorFile::openFile() {
				if (file)
				MLIR_SPARSETENSOR_FATAL("Already opened file %s\n", filename);
				file = fopen(filename, "r");
				if (!file)
				MLIR_SPARSETENSOR_FATAL("Cannot find file %s\n", filename);
				}

				/// Closes the file.
				void SparseTensorFile::closeFile() {
				if (file) {
				fclose(file);
				file = nullptr;
				}
				}

				// TODO(wrengr/bixia): figure out how to reorganize the element-parsing
				// loop of `openSparseTensorCOO` into methods of this class, so we can
				// avoid leaking access to the `line` pointer (both for general hygiene
				// and because we can't mark it const due to the second argument of
				// `strtoul`/`strtod` being `char * *restrict` rather than
				// `char const* *restrict`).
				//
				/// Attempts to read a line from the file.
				char *SparseTensorFile::readLine() {
				if (fgets(line, kColWidth, file))
				return line;
				MLIR_SPARSETENSOR_FATAL("Cannot read next line of %s\n", filename);
				}

				/// Reads and parses the file's header.
				void SparseTensorFile::readHeader() {
				assert(file && "Attempt to readHeader() before openFile()");
				if (strstr(filename, ".mtx"))
				readMMEHeader();
				else if (strstr(filename, ".tns"))
				readExtFROSTTHeader();
				else
				MLIR_SPARSETENSOR_FATAL("Unknown format %s\n", filename);
				assert(isValid() && "Failed to read the header");
				}

				/// Asserts the shape subsumes the actual dimension sizes. Is only
				/// valid after parsing the header.
				void SparseTensorFile::assertMatchesShape(uint64_t rank,
				const uint64_t *shape) const {
				assert(rank == getRank() && "Rank mismatch");
				for (uint64_t r = 0; r < rank; r++)
				assert((shape[r] == 0 || shape[r] == idata[2 + r]) &&
				"Dimension size mismatch");
				}

				/// Helper to convert string to lower case.
				static inline char *toLower(char *token) {
				for (char *c = token; *c; ++c)
				*c = tolower(*c);
				return token;
				}

				/// Read the MME header of a general sparse matrix of type real.
				void SparseTensorFile::readMMEHeader() {
				char header[64];
				char object[64];
				char format[64];
				char field[64];
				char symmetry[64];
				// Read header line.
				if (fscanf(file, "%63s %63s %63s %63s %63s\n", header, object, format, field,
				symmetry) != 5)
				MLIR_SPARSETENSOR_FATAL("Corrupt header in %s\n", filename);
				// Process `field`, which specifies either pattern or the data type of the values.
				if (strcmp(toLower(field), "pattern") == 0)
				valueKind_ = ValueKind::kPattern;
				else if (strcmp(toLower(field), "real") == 0)
				valueKind_ = ValueKind::kReal;
				else if (strcmp(toLower(field), "integer") == 0)
				valueKind_ = ValueKind::kInteger;
				else if (strcmp(toLower(field), "complex") == 0)
				valueKind_ = ValueKind::kComplex;
				else
				MLIR_SPARSETENSOR_FATAL("Unexpected header field value in %s\n", filename);

				// Set properties.
				isSymmetric_ = (strcmp(toLower(symmetry), "symmetric") == 0);
				// Make sure this is a general sparse matrix.
				if (strcmp(toLower(header), "%%matrixmarket") ||
				strcmp(toLower(object), "matrix") ||
				strcmp(toLower(format), "coordinate") ||
				(strcmp(toLower(symmetry), "general") && !isSymmetric_))
				MLIR_SPARSETENSOR_FATAL("Cannot find a general sparse matrix in %s\n",
				filename);
				// Skip comments.
				while (true) {
				readLine();
				if (line[0] != '%')
				break;
				}
				// Next line contains M N NNZ.
				idata[0] = 2; // rank
				if (sscanf(line, "%" PRIu64 "%" PRIu64 "%" PRIu64 "\n", idata + 2, idata + 3,
				idata + 1) != 3)
				MLIR_SPARSETENSOR_FATAL("Cannot find size in %s\n", filename);
				}

				/// Read the "extended" FROSTT header. Although not part of the documented
				/// format, we assume that the file starts with optional comments followed
				/// by two lines that define the rank, the number of nonzeros, and the
				/// dimensions sizes (one per rank) of the sparse tensor.
				void SparseTensorFile::readExtFROSTTHeader() {
				// Skip comments.
				while (true) {
				readLine();
				if (line[0] != '#')
				break;
				}
				// Next line contains RANK and NNZ.
				if (sscanf(line, "%" PRIu64 "%" PRIu64 "\n", idata, idata + 1) != 2)
				MLIR_SPARSETENSOR_FATAL("Cannot find metadata in %s\n", filename);
				// Followed by a line with the dimension sizes (one per rank).
				for (uint64_t r = 0; r < idata[0]; r++)
				if (fscanf(file, "%" PRIu64, idata + 2 + r) != 1)
				MLIR_SPARSETENSOR_FATAL("Cannot find dimension size %s\n", filename);
				readLine(); // end of line
				// The FROSTT format does not define the data type of the nonzero elements.
				valueKind_ = ValueKind::kUndefined;
				}

				#endif // MLIR_SPARSETENSOR_DEFINE_FUNCTIONS
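
The case-insensitive banner check performed by `readMMEHeader()` can be sketched in isolation as follows. This is a simplified illustration, not the code from this revision: `lowerInPlace` and `isCoordinateMatrixBanner` are hypothetical names, the input is a string rather than a `FILE *`, the symmetry and field checks are omitted, and the `unsigned char` cast before `std::tolower` is an addition made here for safety with non-ASCII bytes.

```cpp
#include <cassert>
#include <cctype>
#include <cstdio>
#include <cstring>

// Lower-cases a NUL-terminated string in place and returns it, so the call
// can be nested directly inside strcmp().
inline char *lowerInPlace(char *token) {
  for (char *c = token; *c; ++c)
    *c = static_cast<char>(std::tolower(static_cast<unsigned char>(*c)));
  return token;
}

// Hypothetical helper: splits a Matrix Market banner line into its five
// whitespace-separated fields and reports whether it announces a matrix in
// coordinate format. Returns false on a malformed banner instead of aborting.
inline bool isCoordinateMatrixBanner(const char *banner) {
  char header[64], object[64], format[64], field[64], symmetry[64];
  // The %63s width keeps each field within its 64-byte buffer.
  if (std::sscanf(banner, "%63s %63s %63s %63s %63s", header, object, format,
                  field, symmetry) != 5)
    return false;
  // In a plain string literal "%%" is two percent signs, which matches the
  // "%%MatrixMarket" prefix of a real banner.
  return std::strcmp(lowerInPlace(header), "%%matrixmarket") == 0 &&
         std::strcmp(lowerInPlace(object), "matrix") == 0 &&
         std::strcmp(lowerInPlace(format), "coordinate") == 0;
}
```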

mlir/lib/ExecutionEngine/SparseTensor/NNZ.cpp

This file was added.

				//===- NNZ.cpp - NNZ-statistics for direct sparse2sparse conversion -------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file contains method definitions for `SparseTensorNNZ`.
				//
				// This file is part of the lightweight runtime support library for sparse
				// tensor manipulations. The functionality of the support library is meant
				// to simplify benchmarking, testing, and debugging MLIR code operating on
				// sparse tensors. However, the provided functionality is not part of
				// core MLIR itself.
				//
				//===----------------------------------------------------------------------===//

				#include "mlir/ExecutionEngine/SparseTensor/Storage.h"

				#ifdef MLIR_SPARSETENSOR_DEFINE_FUNCTIONS // We are building this library

				using namespace mlir::sparse_tensor;

				SparseTensorNNZ::SparseTensorNNZ(const std::vector<uint64_t> &dimSizes,
				const std::vector<DimLevelType> &sparsity)
				: dimSizes(dimSizes), dimTypes(sparsity), nnz(getRank()) {
				assert(dimSizes.size() == dimTypes.size() && "Rank mismatch");
				bool uncompressed = true;
				(void)uncompressed;
				uint64_t sz = 1; // the product of all `dimSizes` strictly less than `r`.
				for (uint64_t rank = getRank(), r = 0; r < rank; r++) {
				switch (dimTypes[r]) {
				case DimLevelType::kCompressed:
				assert(uncompressed &&
				"Multiple compressed layers not currently supported");
				uncompressed = false;
				nnz[r].resize(sz, 0); // Both allocate and zero-initialize.
				break;
				case DimLevelType::kDense:
				assert(uncompressed && "Dense after compressed not currently supported");
				break;
				case DimLevelType::kSingleton:
				// Singleton after Compressed causes no problems for allocating
				// `nnz` nor for the yieldPos loop. This remains true even
				// when adding support for multiple compressed dimensions or
				// for dense-after-compressed.
				break;
				default:
				MLIR_SPARSETENSOR_FATAL("unsupported dimension level type");
				}
				sz = detail::checkedMul(sz, dimSizes[r]);
				}
				}

				void SparseTensorNNZ::forallIndices(uint64_t stopDim,
				SparseTensorNNZ::NNZConsumer yield) const {
				assert(stopDim < getRank() && "Stopping-dimension is out of bounds");
				assert(dimTypes[stopDim] == DimLevelType::kCompressed &&
				"Cannot look up non-compressed dimensions");
				forallIndices(yield, stopDim, 0, 0);
				}

				void SparseTensorNNZ::add(const std::vector<uint64_t> &ind) {
				uint64_t parentPos = 0;
				for (uint64_t rank = getRank(), r = 0; r < rank; r++) {
				if (dimTypes[r] == DimLevelType::kCompressed)
				nnz[r][parentPos]++;
				parentPos = parentPos * dimSizes[r] + ind[r];
				}
				}

				void SparseTensorNNZ::forallIndices(SparseTensorNNZ::NNZConsumer yield,
				uint64_t stopDim, uint64_t parentPos,
				uint64_t d) const {
				assert(d <= stopDim);
				if (d == stopDim) {
				assert(parentPos < nnz[d].size() && "Cursor is out of range");
				yield(nnz[d][parentPos]);
				} else {
				const uint64_t sz = dimSizes[d];
				const uint64_t pstart = parentPos * sz;
				for (uint64_t i = 0; i < sz; i++)
				forallIndices(yield, stopDim, pstart + i, d + 1);
				}
				}

				#endif // MLIR_SPARSETENSOR_DEFINE_FUNCTIONS
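
The position arithmetic in `SparseTensorNNZ::add` amounts to a row-major linearization of the index prefix. A minimal stand-alone sketch, assuming nothing beyond the loop shown above, is given below; `parentPositions` is a hypothetical name, and the function only records the `parentPos` seen at each level rather than updating any `nnz` counters.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns, for each level r, the parent position that add() would use to
// address nnz[r]: the row-major linearization of ind[0..r-1] over
// dimSizes[0..r-1]. Level 0 always has parent position 0.
inline std::vector<uint64_t>
parentPositions(const std::vector<uint64_t> &dimSizes,
                const std::vector<uint64_t> &ind) {
  assert(dimSizes.size() == ind.size() && "Rank mismatch");
  std::vector<uint64_t> positions;
  uint64_t parentPos = 0;
  for (uint64_t r = 0, rank = dimSizes.size(); r < rank; r++) {
    positions.push_back(parentPos);
    // Same update as in add(): descend one level in row-major order.
    parentPos = parentPos * dimSizes[r] + ind[r];
  }
  return positions;
}
```

For dimension sizes `{3, 4, 5}` and index `{2, 1, 3}`, the parent positions are `0`, `2`, and `2 * 4 + 1 = 9`. Each stays below the product of the preceding dimension sizes (`1`, `3`, `12`), which is the size the constructor gives `nnz[r]` for a compressed level.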

mlir/lib/ExecutionEngine/SparseTensor/Storage.cpp

This file was added.

				//===- StorageBase.cpp - TACO-flavored sparse tensor representation -------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file contains method definitions for `SparseTensorStorageBase`.
				// In particular we want to ensure that the default implementations of
				// the "partial method specialization" trick aren't inline (since there's
				// no benefit). Though this also helps ensure that we avoid weak-vtables:
				// <https://llvm.org/docs/CodingStandards.html#provide-a-virtual-method-anchor-for-classes-in-headers>
				//
				// This file is part of the lightweight runtime support library for sparse
				// tensor manipulations. The functionality of the support library is meant
				// to simplify benchmarking, testing, and debugging MLIR code operating on
				// sparse tensors. However, the provided functionality is not part of
				// core MLIR itself.
				//
				//===----------------------------------------------------------------------===//

				#include "mlir/ExecutionEngine/SparseTensor/Storage.h"

				#ifdef MLIR_SPARSETENSOR_DEFINE_FUNCTIONS // We are building this library

				using namespace mlir::sparse_tensor;

				SparseTensorStorageBase::SparseTensorStorageBase(
				const std::vector<uint64_t> &dimSizes, const uint64_t *perm,
				const DimLevelType *sparsity)
				: dimSizes(dimSizes), rev(getRank()),
				dimTypes(sparsity, sparsity + getRank()) {
				assert(perm && sparsity);
				const uint64_t rank = getRank();
				// Validate parameters.
				assert(rank > 0 && "Trivial shape is unsupported");
				for (uint64_t r = 0; r < rank; r++) {
				assert(dimSizes[r] > 0 && "Dimension size zero has trivial storage");
				assert((isDenseDim(r) || isCompressedDim(r) || isSingletonDim(r)) &&
				"Unsupported DimLevelType");
				}
				// Construct the "reverse" (i.e., inverse) permutation.
				for (uint64_t r = 0; r < rank; r++)
				rev[perm[r]] = r;
				}

				// Helper macro for generating error messages when some
				// `SparseTensorStorage<P,I,V>` is cast to `SparseTensorStorageBase`
				// and then the wrong "partial method specialization" is called.
				#define FATAL_PIV(NAME) \
				MLIR_SPARSETENSOR_FATAL("<P,I,V> type mismatch for: " #NAME);

				#define IMPL_NEWENUMERATOR(VNAME, V) \
				void SparseTensorStorageBase::newEnumerator( \
				SparseTensorEnumeratorBase<V> **, uint64_t, const uint64_t *) const { \
				FATAL_PIV("newEnumerator" #VNAME); \
				}
				FOREVERY_V(IMPL_NEWENUMERATOR)
				#undef IMPL_NEWENUMERATOR

				#define IMPL_GETPOINTERS(PNAME, P) \
				void SparseTensorStorageBase::getPointers(std::vector<P> **, uint64_t) { \
				FATAL_PIV("getPointers" #PNAME); \
				}
				FOREVERY_FIXED_O(IMPL_GETPOINTERS)
				#undef IMPL_GETPOINTERS

				#define IMPL_GETINDICES(INAME, I) \
				void SparseTensorStorageBase::getIndices(std::vector<I> **, uint64_t) { \
				FATAL_PIV("getIndices" #INAME); \
				}
				FOREVERY_FIXED_O(IMPL_GETINDICES)
				#undef IMPL_GETINDICES

				#define IMPL_GETVALUES(VNAME, V) \
				void SparseTensorStorageBase::getValues(std::vector<V> **) { \
				FATAL_PIV("getValues" #VNAME); \
				}
				FOREVERY_V(IMPL_GETVALUES)
				#undef IMPL_GETVALUES

				#define IMPL_LEXINSERT(VNAME, V) \
				void SparseTensorStorageBase::lexInsert(const uint64_t *, V) { \
				FATAL_PIV("lexInsert" #VNAME); \
				}
				FOREVERY_V(IMPL_LEXINSERT)
				#undef IMPL_LEXINSERT

				#define IMPL_EXPINSERT(VNAME, V) \
				void SparseTensorStorageBase::expInsert(uint64_t *, V *, bool *, uint64_t *, \
				uint64_t) { \
				FATAL_PIV("expInsert" #VNAME); \
				}
				FOREVERY_V(IMPL_EXPINSERT)
				#undef IMPL_EXPINSERT

				#undef FATAL_PIV

				#endif // MLIR_SPARSETENSOR_DEFINE_FUNCTIONS
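
The "partial method specialization" trick mentioned in the file comment can be sketched roughly as follows. This is an illustration under stated assumptions rather than the design in this revision: `StorageBaseDemo` and `StorageDemo` are hypothetical names, only two value types are covered, and the default overloads return `false` on a type mismatch instead of calling `MLIR_SPARSETENSOR_FATAL`, so that the behaviour can be observed without terminating the process.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Non-template base class: one virtual overload per supported value type.
class StorageBaseDemo {
public:
  virtual ~StorageBaseDemo();
  virtual bool getValues(std::vector<double> **out);
  virtual bool getValues(std::vector<float> **out);
};

// Out-of-line definitions: the defaults signal "wrong value type", and
// defining them here (not in the class body) also anchors the vtable.
StorageBaseDemo::~StorageBaseDemo() = default;
bool StorageBaseDemo::getValues(std::vector<double> **) { return false; }
bool StorageBaseDemo::getValues(std::vector<float> **) { return false; }

// Template subclass: overrides only the overload matching its own V.
template <typename V>
class StorageDemo final : public StorageBaseDemo {
public:
  explicit StorageDemo(std::vector<V> vals) : values(std::move(vals)) {}
  // Keep the non-matching base overloads visible on the derived type.
  using StorageBaseDemo::getValues;
  bool getValues(std::vector<V> **out) override {
    *out = &values;
    return true;
  }

private:
  std::vector<V> values;
};
```

A caller holding only a `StorageBaseDemo *` can then request the values with the type it expects; a matching request reaches the override, and a mismatched one falls through to the base default.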

mlir/lib/ExecutionEngine/SparseTensorUtils.cpp

[9 lines not shown]
// for sparse tensor manipulations. The functionality provided in this library		// for sparse tensor manipulations. The functionality provided in this library
// is meant to simplify benchmarking, testing, and debugging MLIR code that		// is meant to simplify benchmarking, testing, and debugging MLIR code that
// operates on sparse tensors. The provided functionality is not part		// operates on sparse tensors. The provided functionality is not part
// of core MLIR, however.		// of core MLIR, however.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "mlir/ExecutionEngine/SparseTensorUtils.h"		#include "mlir/ExecutionEngine/SparseTensorUtils.h"
		#include "mlir/ExecutionEngine/SparseTensor/COO.h"
		#include "mlir/ExecutionEngine/SparseTensor/ErrorHandling.h"
		#include "mlir/ExecutionEngine/SparseTensor/File.h"
		#include "mlir/ExecutionEngine/SparseTensor/Storage.h"

#ifdef MLIR_CRUNNERUTILS_DEFINE_FUNCTIONS		#ifdef MLIR_CRUNNERUTILS_DEFINE_FUNCTIONS

#include <algorithm>		#include <algorithm>
#include <cassert>		#include <cassert>
#include <cctype>		#include <cctype>
#include <cstdio>		#include <cstdio>
#include <cstdlib>		#include <cstdlib>
#include <cstring>		#include <cstring>
#include <fstream>		#include <fstream>
#include <functional>		#include <functional>
#include <iostream>		#include <iostream>
#include <limits>		#include <limits>
#include <numeric>		#include <numeric>

		using namespace mlir::sparse_tensor;

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// Internal support for storing and reading sparse tensors.		// Internal support for storing and reading sparse tensors.
//		//
// The following memory-resident sparse storage schemes are supported:		// The following memory-resident sparse storage schemes are supported:
//		//
// (a) A coordinate scheme for temporarily storing and lexicographically		// (a) A coordinate scheme for temporarily storing and lexicographically
// sorting a sparse tensor by index (SparseTensorCOO).		// sorting a sparse tensor by index (SparseTensorCOO).
[22 lines not shown]
//		//
// In both cases (I) and (II), the SparseTensorStorage format is externally		// In both cases (I) and (II), the SparseTensorStorage format is externally
// only visible as an opaque pointer.		// only visible as an opaque pointer.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

namespace {		namespace {

static constexpr int kColWidth = 1025;

/// A version of `operator*` on `uint64_t` which checks for overflows.
static inline uint64_t checkedMul(uint64_t lhs, uint64_t rhs) {
assert((lhs == 0 || rhs <= std::numeric_limits<uint64_t>::max() / lhs) &&
"Integer overflow");
return lhs * rhs;
}

// This macro helps minimize repetition of this idiom, as well as ensuring
// we have some additional output indicating where the error is coming from.
// (Since `fprintf` doesn't provide a stacktrace, this helps make it easier
// to track down whether an error is coming from our code vs somewhere else
// in MLIR.)
#define FATAL(...) \
do { \
fprintf(stderr, "SparseTensorUtils: " __VA_ARGS__); \
exit(1); \
} while (0)

// TODO: try to unify this with `SparseTensorFile::assertMatchesShape`
// which is used by `openSparseTensorCOO`. It's easy enough to resolve
// the `std::vector` vs pointer mismatch for `dimSizes`; but it's trickier
// to resolve the presence/absence of `perm` (without introducing extra
// overhead), so perhaps the code duplication is unavoidable.
//
/// Asserts that the `dimSizes` (in target-order) under the `perm` (mapping
/// semantic-order to target-order) are a refinement of the desired `shape`
/// (in semantic-order).
///
/// Precondition: `perm` and `shape` must be valid for `rank`.
static inline void
assertPermutedSizesMatchShape(const std::vector<uint64_t> &dimSizes,
uint64_t rank, const uint64_t *perm,
const uint64_t *shape) {
assert(perm && shape);
assert(rank == dimSizes.size() && "Rank mismatch");
for (uint64_t r = 0; r < rank; r++)
assert((shape[r] == 0 || shape[r] == dimSizes[perm[r]]) &&
"Dimension size mismatch");
}

/// A sparse tensor element in coordinate scheme (value and indices).
/// For example, a rank-1 vector element would look like
/// ({i}, a[i])
/// and a rank-5 tensor element like
/// ({i,j,k,l,m}, a[i,j,k,l,m])
/// We use pointer to a shared index pool rather than e.g. a direct
/// vector since that (1) reduces the per-element memory footprint, and
/// (2) centralizes the memory reservation and (re)allocation to one place.
template <typename V>
struct Element final {
Element(uint64_t *ind, V val) : indices(ind), value(val){};
uint64_t *indices; // pointer into shared index pool
V value;
};

/// The type of callback functions which receive an element. We avoid
/// packaging the coordinates and value together as an `Element` object
/// because this helps keep code somewhat cleaner.
template <typename V>
using ElementConsumer =
const std::function<void(const std::vector<uint64_t> &, V)> &;

/// A memory-resident sparse tensor in coordinate scheme (collection of
/// elements). This data structure is used to read a sparse tensor from
/// any external format into memory and sort the elements lexicographically
/// by indices before passing it back to the client (most packed storage
/// formats require the elements to appear in lexicographic index order).
template <typename V>
struct SparseTensorCOO final {
public:
SparseTensorCOO(const std::vector<uint64_t> &dimSizes, uint64_t capacity)
: dimSizes(dimSizes) {
if (capacity) {
elements.reserve(capacity);
indices.reserve(capacity * getRank());
}
}

/// Adds element as indices and value.
void add(const std::vector<uint64_t> &ind, V val) {
assert(!iteratorLocked && "Attempt to add() after startIterator()");
uint64_t *base = indices.data();
uint64_t size = indices.size();
uint64_t rank = getRank();
assert(ind.size() == rank && "Element rank mismatch");
for (uint64_t r = 0; r < rank; r++) {
assert(ind[r] < dimSizes[r] && "Index is too large for the dimension");
indices.push_back(ind[r]);
}
// This base only changes if indices were reallocated. In that case, we
// need to correct all previous pointers into the vector. Note that this
// only happens if we did not set the initial capacity right, and then only
// for every internal vector reallocation (which with the doubling rule
// should only incur an amortized linear overhead).
uint64_t *newBase = indices.data();
if (newBase != base) {
for (uint64_t i = 0, n = elements.size(); i < n; i++)
elements[i].indices = newBase + (elements[i].indices - base);
base = newBase;
}
// Add element as (pointer into shared index pool, value) pair.
elements.emplace_back(base + size, val);
}

/// Sorts elements lexicographically by index.
void sort() {
assert(!iteratorLocked && "Attempt to sort() after startIterator()");
// TODO: we may want to cache an `isSorted` bit, to avoid
// unnecessary/redundant sorting.
uint64_t rank = getRank();
std::sort(elements.begin(), elements.end(),
[rank](const Element<V> &e1, const Element<V> &e2) {
for (uint64_t r = 0; r < rank; r++) {
if (e1.indices[r] == e2.indices[r])
continue;
return e1.indices[r] < e2.indices[r];
}
return false;
});
}

/// Get the rank of the tensor.
uint64_t getRank() const { return dimSizes.size(); }

/// Getter for the dimension-sizes array.
const std::vector<uint64_t> &getDimSizes() const { return dimSizes; }

/// Getter for the elements array.
const std::vector<Element<V>> &getElements() const { return elements; }

/// Switch into iterator mode.
void startIterator() {
iteratorLocked = true;
iteratorPos = 0;
}

/// Get the next element.
const Element<V> *getNext() {
assert(iteratorLocked && "Attempt to getNext() before startIterator()");
if (iteratorPos < elements.size())
return &(elements[iteratorPos++]);
iteratorLocked = false;
return nullptr;
}

/// Factory method. Permutes the original dimensions according to
/// the given ordering and expects subsequent add() calls to honor
/// that same ordering for the given indices. The result is a
/// fully permuted coordinate scheme.
///
/// Precondition: `dimSizes` and `perm` must be valid for `rank`.
static SparseTensorCOO<V> *newSparseTensorCOO(uint64_t rank,
const uint64_t *dimSizes,
const uint64_t *perm,
uint64_t capacity = 0) {
std::vector<uint64_t> permsz(rank);
for (uint64_t r = 0; r < rank; r++) {
assert(dimSizes[r] > 0 && "Dimension size zero has trivial storage");
permsz[perm[r]] = dimSizes[r];
}
return new SparseTensorCOO<V>(permsz, capacity);
}

private:
const std::vector<uint64_t> dimSizes; // per-dimension sizes
std::vector<Element<V>> elements; // all COO elements
std::vector<uint64_t> indices; // shared index pool
bool iteratorLocked = false;
unsigned iteratorPos = 0;
};
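
// Illustrative sketch (not part of this patch): the comparator passed to
// `std::sort` in `sort()` orders elements by their first differing
// coordinate. The `Coord`, `lexLess`, and `sortCoords` names below are
// hypothetical, and `Coord` owns its coordinates instead of pointing into
// a shared pool, purely to keep the example short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for `Element<V>` that owns its coordinates.
struct Coord {
  std::vector<uint64_t> indices;
  double value;
};

// Returns true if `a` precedes `b` lexicographically over `rank` dimensions.
inline bool lexLess(const Coord &a, const Coord &b, uint64_t rank) {
  for (uint64_t r = 0; r < rank; r++) {
    if (a.indices[r] == b.indices[r])
      continue; // tie in this dimension, so look at the next one
    return a.indices[r] < b.indices[r];
  }
  return false; // identical coordinates: neither precedes the other
}

// Sorts the elements in place into lexicographic index order.
inline void sortCoords(std::vector<Coord> &elems, uint64_t rank) {
  std::sort(elems.begin(), elems.end(),
            [rank](const Coord &a, const Coord &b) {
              return lexLess(a, b, rank);
            });
}
```

// Returning `false` for identical coordinates keeps the comparator a
// strict weak ordering, which `std::sort` requires.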

// Forward.
template <typename V>
class SparseTensorEnumeratorBase;

// Helper macro for generating error messages when some
// `SparseTensorStorage<P,I,V>` is cast to `SparseTensorStorageBase`
// and then the wrong "partial method specialization" is called.
#define FATAL_PIV(NAME) FATAL("<P,I,V> type mismatch for: " #NAME);

/// Abstract base class for `SparseTensorStorage<P,I,V>`. This class
/// takes responsibility for all the `<P,I,V>`-independent aspects
/// of the tensor (e.g., shape, sparsity, permutation). In addition,
/// we use function overloading to implement "partial" method
/// specialization, which the C-API relies on to catch type errors
/// arising from our use of opaque pointers.
class SparseTensorStorageBase {
public:
/// Constructs a new storage object. The `perm` maps the tensor's
/// semantic-ordering of dimensions to this object's storage-order.
/// The `dimSizes` and `sparsity` arrays are already in storage-order.
///
/// Precondition: `perm` and `sparsity` must be valid for `dimSizes.size()`.
SparseTensorStorageBase(const std::vector<uint64_t> &dimSizes,
const uint64_t *perm, const DimLevelType *sparsity)
: dimSizes(dimSizes), rev(getRank()),
dimTypes(sparsity, sparsity + getRank()) {
assert(perm && sparsity);
const uint64_t rank = getRank();
// Validate parameters.
assert(rank > 0 && "Trivial shape is unsupported");
for (uint64_t r = 0; r < rank; r++) {
assert(dimSizes[r] > 0 && "Dimension size zero has trivial storage");
assert((dimTypes[r] == DimLevelType::kDense ||
dimTypes[r] == DimLevelType::kCompressed) &&
"Unsupported DimLevelType");
}
// Construct the "reverse" (i.e., inverse) permutation.
for (uint64_t r = 0; r < rank; r++)
rev[perm[r]] = r;
}

virtual ~SparseTensorStorageBase() = default;

/// Get the rank of the tensor.
uint64_t getRank() const { return dimSizes.size(); }

/// Getter for the dimension-sizes array, in storage-order.
const std::vector<uint64_t> &getDimSizes() const { return dimSizes; }

/// Safely lookup the size of the given (storage-order) dimension.
uint64_t getDimSize(uint64_t d) const {
assert(d < getRank());
return dimSizes[d];
}

/// Getter for the "reverse" permutation, which maps this object's
/// storage-order to the tensor's semantic-order.
const std::vector<uint64_t> &getRev() const { return rev; }

/// Getter for the dimension-types array, in storage-order.
const std::vector<DimLevelType> &getDimTypes() const { return dimTypes; }

/// Safely check if the (storage-order) dimension uses compressed storage.
bool isCompressedDim(uint64_t d) const {
assert(d < getRank());
return (dimTypes[d] == DimLevelType::kCompressed);
}

/// Allocate a new enumerator.
#define DECL_NEWENUMERATOR(VNAME, V) \
virtual void newEnumerator(SparseTensorEnumeratorBase<V> **, uint64_t, \
const uint64_t *) const { \
FATAL_PIV("newEnumerator" #VNAME); \
}
FOREVERY_V(DECL_NEWENUMERATOR)
#undef DECL_NEWENUMERATOR

/// Overhead storage.
#define DECL_GETPOINTERS(PNAME, P) \
virtual void getPointers(std::vector<P> **, uint64_t) { \
FATAL_PIV("getPointers" #PNAME); \
}
FOREVERY_FIXED_O(DECL_GETPOINTERS)
#undef DECL_GETPOINTERS
#define DECL_GETINDICES(INAME, I) \
virtual void getIndices(std::vector<I> **, uint64_t) { \
FATAL_PIV("getIndices" #INAME); \
}
FOREVERY_FIXED_O(DECL_GETINDICES)
#undef DECL_GETINDICES

/// Primary storage.
#define DECL_GETVALUES(VNAME, V) \
virtual void getValues(std::vector<V> **) { FATAL_PIV("getValues" #VNAME); }
FOREVERY_V(DECL_GETVALUES)
#undef DECL_GETVALUES

/// Element-wise insertion in lexicographic index order.
#define DECL_LEXINSERT(VNAME, V) \
virtual void lexInsert(const uint64_t *, V) { FATAL_PIV("lexInsert" #VNAME); }
FOREVERY_V(DECL_LEXINSERT)
#undef DECL_LEXINSERT

/// Expanded insertion.
#define DECL_EXPINSERT(VNAME, V) \
virtual void expInsert(uint64_t *, V *, bool *, uint64_t *, uint64_t) { \
FATAL_PIV("expInsert" #VNAME); \
}
FOREVERY_V(DECL_EXPINSERT)
#undef DECL_EXPINSERT

/// Finishes insertion.
virtual void endInsert() = 0;

protected:
// Since this class is virtual, we must disallow public copying in
// order to avoid "slicing". Since this class has data members,
// that means making copying protected.
// <https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rc-copy-virtual>
SparseTensorStorageBase(const SparseTensorStorageBase &) = default;
// Copy-assignment would be implicitly deleted (because `dimSizes`
// is const), so we explicitly delete it for clarity.
SparseTensorStorageBase &operator=(const SparseTensorStorageBase &) = delete;

private:
const std::vector<uint64_t> dimSizes;
std::vector<uint64_t> rev;
const std::vector<DimLevelType> dimTypes;
};

#undef FATAL_PIV
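
// Illustrative sketch (not part of this patch): the constructor of
// `SparseTensorStorageBase` builds the "reverse" permutation with
// `rev[perm[r]] = r`. The helper name `invertPermutation` is hypothetical;
// it only isolates that loop so the inverse property can be checked.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns `rev` such that `rev[perm[r]] == r`.
inline std::vector<uint64_t>
invertPermutation(const std::vector<uint64_t> &perm) {
  const uint64_t rank = perm.size();
  std::vector<uint64_t> rev(rank);
  for (uint64_t r = 0; r < rank; r++) {
    assert(perm[r] < rank && "Permutation entry is out of bounds");
    rev[perm[r]] = r;
  }
  return rev;
}
```

// For `perm = {2, 0, 1}` this gives `rev = {1, 2, 0}`: semantic dimension 0
// is stored at position 2, so storage position 2 maps back to dimension 0.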

// Forward.
template <typename P, typename I, typename V>
class SparseTensorEnumerator;

/// A memory-resident sparse tensor using a storage scheme based on
/// per-dimension sparse/dense annotations. This data structure provides a
/// bufferized form of a sparse tensor type. In contrast to generating setup
/// methods for each differently annotated sparse tensor, this method provides
/// a convenient "one-size-fits-all" solution that simply takes an input tensor
/// and annotations to implement all required setup in a general manner.
template <typename P, typename I, typename V>
class SparseTensorStorage final : public SparseTensorStorageBase {
/// Private constructor to share code between the other constructors.
/// Beware that the object is not necessarily guaranteed to be in a
/// valid state after this constructor alone; e.g., `isCompressedDim(d)`
/// doesn't entail `!(pointers[d].empty())`.
///
/// Precondition: `perm` and `sparsity` must be valid for `dimSizes.size()`.
SparseTensorStorage(const std::vector<uint64_t> &dimSizes,
const uint64_t *perm, const DimLevelType *sparsity)
: SparseTensorStorageBase(dimSizes, perm, sparsity), pointers(getRank()),
indices(getRank()), idx(getRank()) {}

public:
/// Constructs a sparse tensor storage scheme with the given dimensions,
/// permutation, and per-dimension dense/sparse annotations, using
/// the coordinate scheme tensor for the initial contents if provided.
///
/// Precondition: `perm` and `sparsity` must be valid for `dimSizes.size()`.
SparseTensorStorage(const std::vector<uint64_t> &dimSizes,
const uint64_t *perm, const DimLevelType *sparsity,
SparseTensorCOO<V> *coo)
: SparseTensorStorage(dimSizes, perm, sparsity) {
// Provide hints on capacity of pointers and indices.
// TODO: needs much fine-tuning based on actual sparsity; currently
// we reserve pointer/index space based on all previous dense
// dimensions, which works well up to first sparse dim; but
// we should really use nnz and dense/sparse distribution.
bool allDense = true;
uint64_t sz = 1;
for (uint64_t r = 0, rank = getRank(); r < rank; r++) {
if (isCompressedDim(r)) {
// TODO: Take a parameter between 1 and `dimSizes[r]`, and multiply
// `sz` by that before reserving. (For now we just use 1.)
pointers[r].reserve(sz + 1);
pointers[r].push_back(0);
indices[r].reserve(sz);
sz = 1;
allDense = false;
} else { // Dense dimension.
sz = checkedMul(sz, getDimSizes()[r]);
}
}
// Then assign contents from coordinate scheme tensor if provided.
if (coo) {
// Ensure both preconditions of `fromCOO`.
assert(coo->getDimSizes() == getDimSizes() && "Tensor size mismatch");
coo->sort();
// Now actually insert the `elements`.
const std::vector<Element<V>> &elements = coo->getElements();
uint64_t nnz = elements.size();
values.reserve(nnz);
fromCOO(elements, 0, nnz, 0);
} else if (allDense) {
values.resize(sz, 0);
}
}

/// Constructs a sparse tensor storage scheme with the given dimensions,
/// permutation, and per-dimension dense/sparse annotations, using
/// the given sparse tensor for the initial contents.
///
/// Preconditions:
/// * `perm` and `sparsity` must be valid for `dimSizes.size()`.
/// * The `tensor` must have the same value type `V`.
SparseTensorStorage(const std::vector<uint64_t> &dimSizes,
const uint64_t *perm, const DimLevelType *sparsity,
const SparseTensorStorageBase &tensor);

~SparseTensorStorage() final = default;

/// Partially specialize these getter methods based on template types.
void getPointers(std::vector<P> **out, uint64_t d) final {
assert(d < getRank());
*out = &pointers[d];
}
void getIndices(std::vector<I> **out, uint64_t d) final {
assert(d < getRank());
*out = &indices[d];
}
void getValues(std::vector<V> **out) final { *out = &values; }

/// Partially specialize lexicographical insertions based on template types.
void lexInsert(const uint64_t *cursor, V val) final {
// First, wrap up pending insertion path.
uint64_t diff = 0;
uint64_t top = 0;
if (!values.empty()) {
diff = lexDiff(cursor);
endPath(diff + 1);
top = idx[diff] + 1;
}
// Then continue with insertion path.
insPath(cursor, diff, top, val);
}

/// Partially specialize expanded insertions based on template types.
/// Note that this method resets the values/filled-switch array back
/// to all-zero/false while only iterating over the nonzero elements.
void expInsert(uint64_t *cursor, V *values, bool *filled, uint64_t *added,
uint64_t count) final {
if (count == 0)
return;
// Sort.
std::sort(added, added + count);
// Restore insertion path for first insert.
const uint64_t lastDim = getRank() - 1;
uint64_t index = added[0];
cursor[lastDim] = index;
lexInsert(cursor, values[index]);
assert(filled[index]);
values[index] = 0;
filled[index] = false;
// Subsequent insertions are quick.
for (uint64_t i = 1; i < count; i++) {
assert(index < added[i] && "non-lexicographic insertion");
index = added[i];
cursor[lastDim] = index;
insPath(cursor, lastDim, added[i - 1] + 1, values[index]);
assert(filled[index]);
values[index] = 0;
filled[index] = false;
}
}

/// Finalizes lexicographic insertions.
void endInsert() final {
if (values.empty())
finalizeSegment(0);
else
endPath(0);
}

void newEnumerator(SparseTensorEnumeratorBase<V> **out, uint64_t rank,
const uint64_t *perm) const final {
*out = new SparseTensorEnumerator<P, I, V>(*this, rank, perm);
}

/// Returns this sparse tensor storage scheme as a new memory-resident
/// sparse tensor in coordinate scheme with the given dimension order.
///
/// Precondition: `perm` must be valid for `getRank()`.
SparseTensorCOO<V> *toCOO(const uint64_t *perm) const {
SparseTensorEnumeratorBase<V> *enumerator;
newEnumerator(&enumerator, getRank(), perm);
SparseTensorCOO<V> *coo =
new SparseTensorCOO<V>(enumerator->permutedSizes(), values.size());
enumerator->forallElements([&coo](const std::vector<uint64_t> &ind, V val) {
coo->add(ind, val);
});
// TODO: This assertion assumes there are no stored zeros,
// or if there are then that we don't filter them out.
// Cf., <https://github.com/llvm/llvm-project/issues/54179>
assert(coo->getElements().size() == values.size());
delete enumerator;
return coo;
}

/// Factory method. Constructs a sparse tensor storage scheme with the given
/// dimensions, permutation, and per-dimension dense/sparse annotations,
/// using the coordinate scheme tensor for the initial contents if provided.
/// In the latter case, the coordinate scheme must respect the same
/// permutation as is desired for the new sparse tensor storage.
///
/// Precondition: `shape`, `perm`, and `sparsity` must be valid for `rank`.
static SparseTensorStorage<P, I, V> *
newSparseTensor(uint64_t rank, const uint64_t *shape, const uint64_t *perm,
const DimLevelType *sparsity, SparseTensorCOO<V> *coo) {
SparseTensorStorage<P, I, V> *n = nullptr;
if (coo) {
const auto &coosz = coo->getDimSizes();
assertPermutedSizesMatchShape(coosz, rank, perm, shape);
n = new SparseTensorStorage<P, I, V>(coosz, perm, sparsity, coo);
} else {
std::vector<uint64_t> permsz(rank);
for (uint64_t r = 0; r < rank; r++) {
assert(shape[r] > 0 && "Dimension size zero has trivial storage");
permsz[perm[r]] = shape[r];
}
// We pass the null `coo` to ensure we select the intended constructor.
n = new SparseTensorStorage<P, I, V>(permsz, perm, sparsity, coo);
}
return n;
}

/// Factory method. Constructs a sparse tensor storage scheme with
/// the given dimensions, permutation, and per-dimension dense/sparse
/// annotations, using the sparse tensor for the initial contents.
///
/// Preconditions:
/// * `shape`, `perm`, and `sparsity` must be valid for `rank`.
/// * The `tensor` must have the same value type `V`.
static SparseTensorStorage<P, I, V> *
newSparseTensor(uint64_t rank, const uint64_t *shape, const uint64_t *perm,
const DimLevelType *sparsity,
const SparseTensorStorageBase *source) {
assert(source && "Got nullptr for source");
SparseTensorEnumeratorBase<V> *enumerator;
source->newEnumerator(&enumerator, rank, perm);
const auto &permsz = enumerator->permutedSizes();
assertPermutedSizesMatchShape(permsz, rank, perm, shape);
auto *tensor =
new SparseTensorStorage<P, I, V>(permsz, perm, sparsity, *source);
delete enumerator;
return tensor;
}

private:
/// Appends an arbitrary new position to `pointers[d]`. This method
/// checks that `pos` is representable in the `P` type; however, it
/// does not check that `pos` is semantically valid (i.e., larger than
/// the previous position and smaller than `indices[d].capacity()`).
void appendPointer(uint64_t d, uint64_t pos, uint64_t count = 1) {
assert(isCompressedDim(d));
assert(pos <= std::numeric_limits<P>::max() &&
"Pointer value is too large for the P-type");
pointers[d].insert(pointers[d].end(), count, static_cast<P>(pos));
}

/// Appends index `i` to dimension `d`, in the semantically general
/// sense. For non-dense dimensions, that means appending to the
/// `indices[d]` array, checking that `i` is representable in the `I`
/// type; however, we do not verify other semantic requirements (e.g.,
/// that `i` is in bounds for `dimSizes[d]`, and not previously occurring
/// in the same segment). For dense dimensions, this method instead
/// appends the appropriate number of zeros to the `values` array,
/// where `full` is the number of "entries" already written to `values`
/// for this segment (aka one after the highest index previously appended).
void appendIndex(uint64_t d, uint64_t full, uint64_t i) {
if (isCompressedDim(d)) {
assert(i <= std::numeric_limits<I>::max() &&
"Index value is too large for the I-type");
indices[d].push_back(static_cast<I>(i));
} else { // Dense dimension.
assert(i >= full && "Index was already filled");
if (i == full)
return; // Short-circuit, since it'll be a nop.
if (d + 1 == getRank())
values.insert(values.end(), i - full, 0);
else
finalizeSegment(d + 1, 0, i - full);
}
}

/// Writes the given coordinate to `indices[d][pos]`. This method
/// checks that `i` is representable in the `I` type; however, it
/// does not check that `i` is semantically valid (i.e., in bounds
/// for `dimSizes[d]` and not elsewhere occurring in the same segment).
void writeIndex(uint64_t d, uint64_t pos, uint64_t i) {
assert(isCompressedDim(d));
// Subscript assignment to `std::vector` requires that the `pos`-th
// entry has been initialized; thus we must be sure to check `size()`
// here, instead of `capacity()` as would be ideal.
assert(pos < indices[d].size() && "Index position is out of bounds");
assert(i <= std::numeric_limits<I>::max() &&
"Index value is too large for the I-type");
indices[d][pos] = static_cast<I>(i);
}

/// Computes the assembled-size associated with the `d`-th dimension,
/// given the assembled-size associated with the `(d-1)`-th dimension.
/// "Assembled-sizes" correspond to the (nominal) sizes of overhead
/// storage, as opposed to "dimension-sizes" which are the cardinality
/// of coordinates for that dimension.
///
/// Precondition: the `pointers[d]` array must be fully initialized
/// before calling this method.
uint64_t assembledSize(uint64_t parentSz, uint64_t d) const {
if (isCompressedDim(d))
return pointers[d][parentSz];
// else if dense:
return parentSz * getDimSizes()[d];
}

/// Initializes sparse tensor storage scheme from a memory-resident sparse
/// tensor in coordinate scheme. This method prepares the pointers and
/// indices arrays under the given per-dimension dense/sparse annotations.
///
/// Preconditions:
/// (1) the `elements` must be lexicographically sorted.
/// (2) the indices of every element are valid for `dimSizes` (equal rank
/// and pointwise less-than).
void fromCOO(const std::vector<Element<V>> &elements, uint64_t lo,
uint64_t hi, uint64_t d) {
uint64_t rank = getRank();
assert(d <= rank && hi <= elements.size());
// Once dimensions are exhausted, insert the numerical values.
if (d == rank) {
assert(lo < hi);
values.push_back(elements[lo].value);
return;
}
// Visit all elements in this interval.
uint64_t full = 0;
while (lo < hi) { // If `hi` is unchanged, then `lo < elements.size()`.
// Find segment in interval with same index elements in this dimension.
uint64_t i = elements[lo].indices[d];
uint64_t seg = lo + 1;
while (seg < hi && elements[seg].indices[d] == i)
seg++;
// Handle segment in interval for sparse or dense dimension.
appendIndex(d, full, i);
full = i + 1;
fromCOO(elements, lo, seg, d + 1);
// And move on to next segment in interval.
lo = seg;
}
// Finalize the sparse pointer structure at this dimension.
finalizeSegment(d, full);
}

/// Finalize the sparse pointer structure at this dimension.
void finalizeSegment(uint64_t d, uint64_t full = 0, uint64_t count = 1) {
if (count == 0)
return; // Short-circuit, since it'll be a nop.
if (isCompressedDim(d)) {
appendPointer(d, indices[d].size(), count);
} else { // Dense dimension.
const uint64_t sz = getDimSizes()[d];
assert(sz >= full && "Segment is overfull");
count = checkedMul(count, sz - full);
// For dense storage we must enumerate all the remaining coordinates
// in this dimension (i.e., coordinates after the last non-zero
// element), and either fill in their zero values or else recurse
// to finalize some deeper dimension.
if (d + 1 == getRank())
values.insert(values.end(), count, 0);
else
finalizeSegment(d + 1, 0, count);
}
}

/// Wraps up a single insertion path, inner to outer.
void endPath(uint64_t diff) {
uint64_t rank = getRank();
assert(diff <= rank);
for (uint64_t i = 0; i < rank - diff; i++) {
const uint64_t d = rank - i - 1;
finalizeSegment(d, idx[d] + 1);
}
}

/// Continues a single insertion path, outer to inner.
void insPath(const uint64_t *cursor, uint64_t diff, uint64_t top, V val) {
uint64_t rank = getRank();
assert(diff < rank);
for (uint64_t d = diff; d < rank; d++) {
uint64_t i = cursor[d];
appendIndex(d, top, i);
top = 0;
idx[d] = i;
}
values.push_back(val);
}

/// Finds the lexicographic differing dimension.
uint64_t lexDiff(const uint64_t *cursor) const {
for (uint64_t r = 0, rank = getRank(); r < rank; r++)
if (cursor[r] > idx[r])
return r;
else
assert(cursor[r] == idx[r] && "non-lexicographic insertion");
assert(0 && "duplication insertion");
return -1u;
}

// Allow `SparseTensorEnumerator` to access the data-members (to avoid
// the cost of virtual-function dispatch in inner loops), without
// making them public to other client code.
friend class SparseTensorEnumerator<P, I, V>;

std::vector<std::vector<P>> pointers;
std::vector<std::vector<I>> indices;
std::vector<V> values;
std::vector<uint64_t> idx; // index cursor for lexicographic insertion.
};
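
// Illustrative sketch (not part of this patch): for a rank-2 tensor whose
// dimensions are annotated (dense, compressed), the `pointers`/`indices`/
// `values` arrays produced by `fromCOO` and `finalizeSegment` coincide with
// the familiar CSR layout. The `Triple`, `Csr`, and `buildCsr` names are
// hypothetical, and the value type is fixed to `double` for brevity; the
// real class handles arbitrary rank and `<P,I,V>` types.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Triple {
  uint64_t row, col;
  double val;
};

struct Csr {
  std::vector<uint64_t> pointers; // one entry per row, plus a final sentinel
  std::vector<uint64_t> indices;  // column coordinate of each stored value
  std::vector<double> values;
};

// Precondition: `sorted` is in lexicographic (row, col) order.
inline Csr buildCsr(uint64_t numRows, const std::vector<Triple> &sorted) {
  Csr csr;
  csr.pointers.reserve(numRows + 1);
  csr.pointers.push_back(0);
  uint64_t row = 0;
  for (const Triple &t : sorted) {
    assert(t.row < numRows && "Row index is too large");
    assert(t.row >= row && "Input is not sorted by row");
    // Close every row segment that ends before this element's row.
    for (; row < t.row; row++)
      csr.pointers.push_back(csr.indices.size());
    csr.indices.push_back(t.col);
    csr.values.push_back(t.val);
  }
  // Close the current row and any trailing empty rows.
  while (csr.pointers.size() < numRows + 1)
    csr.pointers.push_back(csr.indices.size());
  return csr;
}
```

// Empty rows show up as repeated pointer values, which is the same effect
// `finalizeSegment` achieves by appending `count` copies of the position.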

/// A (higher-order) function object for enumerating the elements of some
/// `SparseTensorStorage` under a permutation. That is, the `forallElements`
/// method encapsulates the loop-nest for enumerating the elements of
/// the source tensor (in whatever order is best for the source tensor),
/// and applies a permutation to the coordinates/indices before handing
/// each element to the callback. A single enumerator object can be
/// freely reused for several calls to `forallElements`, just so long
/// as each call is sequential with respect to one another.
///
/// N.B., this class stores a reference to the `SparseTensorStorageBase`
/// passed to the constructor; thus, objects of this class must not
/// outlive the sparse tensor they depend on.
///
/// Design Note: The reason we define this class instead of simply using
/// `SparseTensorEnumerator<P,I,V>` is because we need to hide/generalize
/// the `<P,I>` template parameters from MLIR client code (to simplify the
/// type parameters used for direct sparse-to-sparse conversion). And the
/// reason we define the `SparseTensorEnumerator<P,I,V>` subclasses rather
/// than simply using this class, is to avoid the cost of virtual-method
/// dispatch within the loop-nest.
template <typename V>
class SparseTensorEnumeratorBase {
public:
/// Constructs an enumerator with the given permutation for mapping
/// the semantic-ordering of dimensions to the desired target-ordering.
///
/// Preconditions:
/// * the `tensor` must have the same `V` value type.
/// * `perm` must be valid for `rank`.
SparseTensorEnumeratorBase(const SparseTensorStorageBase &tensor,
uint64_t rank, const uint64_t *perm)
: src(tensor), permsz(src.getRev().size()), reord(getRank()),
cursor(getRank()) {
assert(perm && "Received nullptr for permutation");
assert(rank == getRank() && "Permutation rank mismatch");
const auto &rev = src.getRev(); // source-order -> semantic-order
const auto &dimSizes = src.getDimSizes(); // in source storage-order
for (uint64_t s = 0; s < rank; s++) { // `s` source storage-order
uint64_t t = perm[rev[s]]; // `t` target-order
reord[s] = t;
permsz[t] = dimSizes[s];
}
}

virtual ~SparseTensorEnumeratorBase() = default;

// We disallow copying to help avoid leaking the `src` reference.
// (In addition to avoiding the problem of slicing.)
SparseTensorEnumeratorBase(const SparseTensorEnumeratorBase &) = delete;
SparseTensorEnumeratorBase &
operator=(const SparseTensorEnumeratorBase &) = delete;

/// Returns the source/target tensor's rank. (The source-rank and
/// target-rank are always equal since we only support permutations.
/// Though once we add support for other dimension mappings, this
/// method will have to be split in two.)
uint64_t getRank() const { return permsz.size(); }

/// Returns the target tensor's dimension sizes.
const std::vector<uint64_t> &permutedSizes() const { return permsz; }

/// Enumerates all elements of the source tensor, permutes their
/// indices, and passes the permuted element to the callback.
/// The callback must not store the cursor reference directly,
/// since this function reuses the storage. Instead, the callback
/// must copy it if they want to keep it.
virtual void forallElements(ElementConsumer<V> yield) = 0;

protected:
const SparseTensorStorageBase &src;
std::vector<uint64_t> permsz; // in target order.
std::vector<uint64_t> reord; // source storage-order -> target order.
std::vector<uint64_t> cursor; // in target order.
};

template <typename P, typename I, typename V>
class SparseTensorEnumerator final : public SparseTensorEnumeratorBase<V> {
using Base = SparseTensorEnumeratorBase<V>;

public:
/// Constructs an enumerator with the given permutation for mapping
/// the semantic-ordering of dimensions to the desired target-ordering.
///
/// Precondition: `perm` must be valid for `rank`.
SparseTensorEnumerator(const SparseTensorStorage<P, I, V> &tensor,
uint64_t rank, const uint64_t *perm)
: Base(tensor, rank, perm) {}

~SparseTensorEnumerator() final = default;

void forallElements(ElementConsumer<V> yield) final {
forallElements(yield, 0, 0);
}

private:
/// The recursive component of the public `forallElements`.
void forallElements(ElementConsumer<V> yield, uint64_t parentPos,
uint64_t d) {
// Recover the `<P,I,V>` type parameters of `src`.
const auto &src =
static_cast<const SparseTensorStorage<P, I, V> &>(this->src);
if (d == Base::getRank()) {
assert(parentPos < src.values.size() &&
"Value position is out of bounds");
// TODO: <https://github.com/llvm/llvm-project/issues/54179>
yield(this->cursor, src.values[parentPos]);
} else if (src.isCompressedDim(d)) {
// Look up the bounds of the `d`-level segment determined by the
// `d-1`-level position `parentPos`.
const std::vector<P> &pointersD = src.pointers[d];
assert(parentPos + 1 < pointersD.size() &&
"Parent pointer position is out of bounds");
const uint64_t pstart = static_cast<uint64_t>(pointersD[parentPos]);
const uint64_t pstop = static_cast<uint64_t>(pointersD[parentPos + 1]);
// Loop-invariant code for looking up the `d`-level coordinates/indices.
const std::vector<I> &indicesD = src.indices[d];
assert(pstop <= indicesD.size() && "Index position is out of bounds");
uint64_t &cursorReordD = this->cursor[this->reord[d]];
for (uint64_t pos = pstart; pos < pstop; pos++) {
cursorReordD = static_cast<uint64_t>(indicesD[pos]);
forallElements(yield, pos, d + 1);
}
} else { // Dense dimension.
const uint64_t sz = src.getDimSizes()[d];
const uint64_t pstart = parentPos * sz;
uint64_t &cursorReordD = this->cursor[this->reord[d]];
for (uint64_t i = 0; i < sz; i++) {
cursorReordD = i;
forallElements(yield, pstart + i, d + 1);
}
}
}
};
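
// Illustrative sketch (not part of this patch): the constructor of
// `SparseTensorEnumeratorBase` composes the source's reverse permutation
// with the requested target permutation via `t = perm[rev[s]]`. The
// `Reordering` struct and `computeReordering` function are hypothetical
// names that isolate that loop.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Reordering {
  std::vector<uint64_t> reord;  // source storage-order -> target order
  std::vector<uint64_t> permsz; // dimension sizes, in target order
};

// `rev` maps source storage-order to semantic-order, `perm` maps
// semantic-order to target order, and `dimSizes` is in source storage-order.
inline Reordering computeReordering(const std::vector<uint64_t> &rev,
                                    const std::vector<uint64_t> &perm,
                                    const std::vector<uint64_t> &dimSizes) {
  const uint64_t rank = dimSizes.size();
  assert(rev.size() == rank && perm.size() == rank && "Rank mismatch");
  Reordering out{std::vector<uint64_t>(rank), std::vector<uint64_t>(rank)};
  for (uint64_t s = 0; s < rank; s++) {
    const uint64_t t = perm[rev[s]];
    out.reord[s] = t;
    out.permsz[t] = dimSizes[s];
  }
  return out;
}
```

// With `rev = {1, 0}` (a column-major source), the identity `perm`, and
// storage-order sizes `{4, 3}`, the target sees sizes `{3, 4}` and each
// source coordinate is written to the swapped cursor slot.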

/// Statistics regarding the number of nonzero subtensors in
/// a source tensor, for direct sparse=>sparse conversion a la
/// <https://arxiv.org/abs/2001.02609>.
///
/// N.B., this class stores references to the parameters passed to
/// the constructor; thus, objects of this class must not outlive
/// those parameters.
class SparseTensorNNZ final {
public:
/// Allocate the statistics structure for the desired sizes and
/// sparsity (in the target tensor's storage-order). This constructor
/// does not actually populate the statistics, however; for that see
/// `initialize`.
///
/// Precondition: `dimSizes` must not contain zeros.
SparseTensorNNZ(const std::vector<uint64_t> &dimSizes,
const std::vector<DimLevelType> &sparsity)
: dimSizes(dimSizes), dimTypes(sparsity), nnz(getRank()) {
assert(dimSizes.size() == dimTypes.size() && "Rank mismatch");
bool uncompressed = true;
(void)uncompressed;
uint64_t sz = 1; // the product of all `dimSizes` strictly less than `r`.
for (uint64_t rank = getRank(), r = 0; r < rank; r++) {
switch (dimTypes[r]) {
case DimLevelType::kCompressed:
assert(uncompressed &&
"Multiple compressed layers not currently supported");
uncompressed = false;
nnz[r].resize(sz, 0); // Both allocate and zero-initialize.
break;
case DimLevelType::kDense:
assert(uncompressed &&
"Dense after compressed not currently supported");
break;
case DimLevelType::kSingleton:
// Singleton after Compressed causes no problems for allocating
// `nnz` nor for the yieldPos loop. This remains true even
// when adding support for multiple compressed dimensions or
// for dense-after-compressed.
break;
default:
FATAL("unsupported dimension level type");
}
sz = checkedMul(sz, dimSizes[r]);

}
}

// We disallow copying to help avoid leaking the stored references.
SparseTensorNNZ(const SparseTensorNNZ &) = delete;
SparseTensorNNZ &operator=(const SparseTensorNNZ &) = delete;

/// Returns the rank of the target tensor.
uint64_t getRank() const { return dimSizes.size(); }

/// Enumerate the source tensor to fill in the statistics. The
/// enumerator should already incorporate the permutation (from
/// semantic-order to the target storage-order).
template <typename V>
void initialize(SparseTensorEnumeratorBase<V> &enumerator) {
assert(enumerator.getRank() == getRank() && "Tensor rank mismatch");
assert(enumerator.permutedSizes() == dimSizes && "Tensor size mismatch");
enumerator.forallElements(
[this](const std::vector<uint64_t> &ind, V) { add(ind); });
}

/// The type of callback functions which receive an nnz-statistic.
using NNZConsumer = const std::function<void(uint64_t)> &;

/// Lexicographically enumerates all indices for dimensions strictly
/// less than `stopDim`, and passes their nnz statistic to the callback.
/// Since our use-case only requires the statistic not the coordinates
/// themselves, we do not bother to construct those coordinates.
void forallIndices(uint64_t stopDim, NNZConsumer yield) const {
assert(stopDim < getRank() && "Stopping-dimension is out of bounds");
assert(dimTypes[stopDim] == DimLevelType::kCompressed &&
"Cannot look up non-compressed dimensions");
forallIndices(yield, stopDim, 0, 0);
}

private:
/// Adds a new element (i.e., increment its statistics). We use
/// a method rather than inlining into the lambda in `initialize`,
/// to avoid spurious templating over `V`. And this method is private
/// to avoid needing to re-assert validity of `ind` (which is guaranteed
/// by `forallElements`).
void add(const std::vector<uint64_t> &ind) {
uint64_t parentPos = 0;
for (uint64_t rank = getRank(), r = 0; r < rank; r++) {
if (dimTypes[r] == DimLevelType::kCompressed)
nnz[r][parentPos]++;
parentPos = parentPos * dimSizes[r] + ind[r];
}
}

/// Recursive component of the public `forallIndices`.
void forallIndices(NNZConsumer yield, uint64_t stopDim, uint64_t parentPos,
uint64_t d) const {
assert(d <= stopDim);
if (d == stopDim) {
assert(parentPos < nnz[d].size() && "Cursor is out of range");
yield(nnz[d][parentPos]);
} else {
const uint64_t sz = dimSizes[d];
const uint64_t pstart = parentPos * sz;
for (uint64_t i = 0; i < sz; i++)
forallIndices(yield, stopDim, pstart + i, d + 1);
}
}

// All of these are in the target storage-order.
const std::vector<uint64_t> &dimSizes;
const std::vector<DimLevelType> &dimTypes;
std::vector<std::vector<uint64_t>> nnz;
};
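
The counting pass performed by `add` can be illustrated in isolation. The following is a minimal sketch, not the code under review: `Level` and `countNNZ` are hypothetical names standing in for `DimLevelType` and the `SparseTensorNNZ` machinery, and the sketch assumes all coordinates are in bounds.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for DimLevelType; only the two levels used here.
enum class Level { kDense, kCompressed };

// For every compressed dimension r, counts how many elements fall under
// each parent position (the linearized coordinates of dimensions < r).
// Dense dimensions get an empty statistics vector.
std::vector<std::vector<uint64_t>>
countNNZ(const std::vector<uint64_t> &dimSizes,
         const std::vector<Level> &dimTypes,
         const std::vector<std::vector<uint64_t>> &coords) {
  const uint64_t rank = dimSizes.size();
  assert(dimTypes.size() == rank && "Rank mismatch");
  std::vector<std::vector<uint64_t>> nnz(rank);
  uint64_t sz = 1; // Number of parent positions for dimension r.
  for (uint64_t r = 0; r < rank; r++) {
    if (dimTypes[r] == Level::kCompressed)
      nnz[r].resize(sz, 0);
    sz *= dimSizes[r];
  }
  for (const auto &ind : coords) {
    assert(ind.size() == rank && "Coordinate rank mismatch");
    uint64_t parentPos = 0;
    for (uint64_t r = 0; r < rank; r++) {
      if (dimTypes[r] == Level::kCompressed)
        nnz[r][parentPos]++;
      // Same linearization as in `add` above.
      parentPos = parentPos * dimSizes[r] + ind[r];
    }
  }
  return nnz;
}
```

For a 2x3 matrix with a dense outer and compressed inner dimension, the statistics for dimension 1 hold one counter per row.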

template <typename P, typename I, typename V>
SparseTensorStorage<P, I, V>::SparseTensorStorage(
const std::vector<uint64_t> &dimSizes, const uint64_t *perm,
const DimLevelType *sparsity, const SparseTensorStorageBase &tensor)
: SparseTensorStorage(dimSizes, perm, sparsity) {
SparseTensorEnumeratorBase<V> *enumerator;
tensor.newEnumerator(&enumerator, getRank(), perm);
{
// Initialize the statistics structure.
SparseTensorNNZ nnz(getDimSizes(), getDimTypes());
nnz.initialize(*enumerator);
// Initialize "pointers" overhead (and allocate "indices", "values").
uint64_t parentSz = 1; // assembled-size (not dimension-size) of `r-1`.
for (uint64_t rank = getRank(), r = 0; r < rank; r++) {
if (isCompressedDim(r)) {
pointers[r].reserve(parentSz + 1);
pointers[r].push_back(0);
uint64_t currentPos = 0;
nnz.forallIndices(r, [this, &currentPos, r](uint64_t n) {
currentPos += n;
appendPointer(r, currentPos);
});
assert(pointers[r].size() == parentSz + 1 &&
"Final pointers size doesn't match allocated size");
// That assertion entails `assembledSize(parentSz, r)`
// is now in a valid state. That is, `pointers[r][parentSz]`
// equals the present value of `currentPos`, which is the
// correct assembled-size for `indices[r]`.
}
// Update assembled-size for the next iteration.
parentSz = assembledSize(parentSz, r);
// Ideally we need only `indices[r].reserve(parentSz)`, however
// the `std::vector` implementation forces us to initialize it too.
// That is, in the yieldPos loop we need random-access assignment
// to `indices[r]`; however, `std::vector`'s subscript-assignment
// only allows assigning to already-initialized positions.
if (isCompressedDim(r))
indices[r].resize(parentSz, 0);
}
values.resize(parentSz, 0); // Both allocate and zero-initialize.
}
// The yieldPos loop
enumerator->forallElements([this](const std::vector<uint64_t> &ind, V val) {
uint64_t parentSz = 1, parentPos = 0;
for (uint64_t rank = getRank(), r = 0; r < rank; r++) {
if (isCompressedDim(r)) {
// If `parentPos == parentSz` then it's valid as an array-lookup;
// however, it's semantically invalid here since that entry
// does not represent a segment of `indices[r]`. Moreover, that
// entry must be immutable for `assembledSize` to remain valid.
assert(parentPos < parentSz && "Pointers position is out of bounds");
const uint64_t currentPos = pointers[r][parentPos];
// This increment won't overflow the `P` type, since it can't
// exceed the original value of `pointers[r][parentPos+1]`
// which was already verified to be within bounds for `P`
// when it was written to the array.
pointers[r][parentPos]++;
writeIndex(r, currentPos, ind[r]);
parentPos = currentPos;
} else { // Dense dimension.
parentPos = parentPos * getDimSizes()[r] + ind[r];
}
parentSz = assembledSize(parentSz, r);
}
assert(parentPos < values.size() && "Value position is out of bounds");
values[parentPos] = val;
});
// No longer need the enumerator, so we'll delete it ASAP.
delete enumerator;
// The finalizeYieldPos loop
for (uint64_t parentSz = 1, rank = getRank(), r = 0; r < rank; r++) {
if (isCompressedDim(r)) {
assert(parentSz == pointers[r].size() - 1 &&
"Actual pointers size doesn't match the expected size");
// Can't check all of them, but at least we can check the last one.
assert(pointers[r][parentSz - 1] == pointers[r][parentSz] &&
"Pointers got corrupted");
// TODO: optimize this by using `memmove` or similar.
for (uint64_t n = 0; n < parentSz; n++) {
const uint64_t parentPos = parentSz - n;
pointers[r][parentPos] = pointers[r][parentPos - 1];
}
pointers[r][0] = 0;
}
parentSz = assembledSize(parentSz, r);
}
}
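
The constructor above works in phases: count, prefix-sum into `pointers`, fill while using `pointers` as write cursors, then shift `pointers` back. A reduced sketch of the same idea for a plain CSR matrix may help when reading it. `Entry`, `CSR`, and `buildCSR` are hypothetical names introduced only for this illustration, and the sketch assumes every row index is less than `rows`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical element and storage types for the illustration.
struct Entry {
  uint64_t row, col;
  double val;
};
struct CSR {
  std::vector<uint64_t> pointers, indices;
  std::vector<double> values;
};

CSR buildCSR(uint64_t rows, const std::vector<Entry> &elems) {
  CSR m;
  // Phase 1: per-row counts (the role of the nnz statistics).
  std::vector<uint64_t> nnz(rows, 0);
  for (const Entry &e : elems) {
    assert(e.row < rows && "Row index is out of bounds");
    nnz[e.row]++;
  }
  // Phase 2: prefix sums give the start of each row segment.
  m.pointers.reserve(rows + 1);
  m.pointers.push_back(0);
  uint64_t pos = 0;
  for (uint64_t r = 0; r < rows; r++) {
    pos += nnz[r];
    m.pointers.push_back(pos);
  }
  m.indices.resize(pos, 0);
  m.values.resize(pos, 0);
  // Phase 3: fill, bumping pointers[row] as a write cursor.
  for (const Entry &e : elems) {
    const uint64_t p = m.pointers[e.row]++;
    m.indices[p] = e.col;
    m.values[p] = e.val;
  }
  // Phase 4: each pointers[r] now holds the start of row r+1,
  // so shift everything right by one and restore the leading zero.
  for (uint64_t n = rows; n > 0; n--)
    m.pointers[n] = m.pointers[n - 1];
  m.pointers[0] = 0;
  return m;
}
```

The final entry `pointers[rows]` is never bumped in phase 3, which is why the shift in phase 4 restores a valid array.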

/// Helper to convert string to lower case.
static char *toLower(char *token) {
for (char *c = token; *c; c++)
*c = tolower(*c);
return token;
}

/// This class abstracts over the information stored in file headers,
/// as well as providing the buffers and methods for parsing those headers.
class SparseTensorFile final {
public:
enum class ValueKind {
kInvalid = 0,
kPattern = 1,
kReal = 2,
kInteger = 3,
kComplex = 4,
kUndefined = 5
};

explicit SparseTensorFile(char *filename) : filename(filename) {
assert(filename && "Received nullptr for filename");
}

// Disallows copying, to avoid duplicating the `file` pointer.
SparseTensorFile(const SparseTensorFile &) = delete;
SparseTensorFile &operator=(const SparseTensorFile &) = delete;

// This dtor tries to avoid leaking the `file`. (Though it's better
// to call `closeFile` explicitly when possible, since there are
// circumstances where dtors are not called reliably.)
~SparseTensorFile() { closeFile(); }

/// Opens the file for reading.
void openFile() {
if (file)
FATAL("Already opened file %s\n", filename);
file = fopen(filename, "r");
if (!file)
FATAL("Cannot find file %s\n", filename);
}

/// Closes the file.
void closeFile() {
if (file) {
fclose(file);
file = nullptr;
}
}

// TODO(wrengr/bixia): figure out how to reorganize the element-parsing
// loop of `openSparseTensorCOO` into methods of this class, so we can
// avoid leaking access to the `line` pointer (both for general hygiene
// and because we can't mark it const due to the second argument of
// `strtoul`/`strtod` being `char * *restrict` rather than
// `char const* *restrict`).
//
/// Attempts to read a line from the file.
char *readLine() {
if (fgets(line, kColWidth, file))
return line;
FATAL("Cannot read next line of %s\n", filename);
}

/// Reads and parses the file's header.
void readHeader() {
assert(file && "Attempt to readHeader() before openFile()");
if (strstr(filename, ".mtx"))
readMMEHeader();
else if (strstr(filename, ".tns"))
readExtFROSTTHeader();
else
FATAL("Unknown format %s\n", filename);
assert(isValid() && "Failed to read the header");
}

ValueKind getValueKind() const { return valueKind_; }

bool isValid() const { return valueKind_ != ValueKind::kInvalid; }

/// Gets the MME "pattern" property setting. Is only valid after
/// parsing the header.
bool isPattern() const {
assert(isValid() && "Attempt to isPattern() before readHeader()");
return valueKind_ == ValueKind::kPattern;
}

/// Gets the MME "symmetric" property setting. Is only valid after
/// parsing the header.
bool isSymmetric() const {
assert(isValid() && "Attempt to isSymmetric() before readHeader()");
return isSymmetric_;
}

/// Gets the rank of the tensor. Is only valid after parsing the header.
uint64_t getRank() const {
assert(isValid() && "Attempt to getRank() before readHeader()");
return idata[0];
}

/// Gets the number of non-zeros. Is only valid after parsing the header.
uint64_t getNNZ() const {
assert(isValid() && "Attempt to getNNZ() before readHeader()");
return idata[1];
}

/// Gets the dimension-sizes array. The pointer itself is always
/// valid; however, the values stored therein are only valid after
/// parsing the header.
const uint64_t *getDimSizes() const { return idata + 2; }

/// Safely gets the size of the given dimension. Is only valid
/// after parsing the header.
uint64_t getDimSize(uint64_t d) const {
assert(d < getRank());
return idata[2 + d];
}

/// Asserts the shape subsumes the actual dimension sizes. Is only
/// valid after parsing the header.
void assertMatchesShape(uint64_t rank, const uint64_t *shape) const {
assert(rank == getRank() && "Rank mismatch");
for (uint64_t r = 0; r < rank; r++)
assert((shape[r] == 0 || shape[r] == idata[2 + r]) &&
"Dimension size mismatch");
}

private:
void readMMEHeader();
void readExtFROSTTHeader();

const char *filename;
FILE *file = nullptr;
ValueKind valueKind_ = ValueKind::kInvalid;
bool isSymmetric_ = false;
uint64_t idata[512];
char line[kColWidth];
};

/// Read the MME header of a general sparse matrix of type real.
void SparseTensorFile::readMMEHeader() {
char header[64];
char object[64];
char format[64];
char field[64];
char symmetry[64];
// Read header line.
if (fscanf(file, "%63s %63s %63s %63s %63s\n", header, object, format, field,
symmetry) != 5)
FATAL("Corrupt header in %s\n", filename);
// Process `field`, which specifies pattern or the data type of the values.
if (strcmp(toLower(field), "pattern") == 0)
valueKind_ = ValueKind::kPattern;
else if (strcmp(toLower(field), "real") == 0)
valueKind_ = ValueKind::kReal;
else if (strcmp(toLower(field), "integer") == 0)
valueKind_ = ValueKind::kInteger;
else if (strcmp(toLower(field), "complex") == 0)
valueKind_ = ValueKind::kComplex;
else
FATAL("Unexpected header field value in %s\n", filename);

// Set properties.
isSymmetric_ = (strcmp(toLower(symmetry), "symmetric") == 0);
// Make sure this is a general sparse matrix.
if (strcmp(toLower(header), "%%matrixmarket") ||
strcmp(toLower(object), "matrix") ||
strcmp(toLower(format), "coordinate") ||
(strcmp(toLower(symmetry), "general") && !isSymmetric_))
FATAL("Cannot find a general sparse matrix in %s\n", filename);
// Skip comments.
while (true) {
readLine();
if (line[0] != '%')
break;
}
// Next line contains M N NNZ.
idata[0] = 2; // rank
if (sscanf(line, "%" PRIu64 "%" PRIu64 "%" PRIu64 "\n", idata + 2, idata + 3,
idata + 1) != 3)
FATAL("Cannot find size in %s\n", filename);
}
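
The banner check in `readMMEHeader` can be exercised without a file. The sketch below parses a banner from an in-memory string; `Banner`, `lowerInPlace`, and `parseBanner` are hypothetical names for this illustration only, and the accepted field and symmetry values simply mirror the comparisons shown above.

```cpp
#include <cassert>
#include <cctype>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical result type for the illustration.
struct Banner {
  bool ok = false;
  std::string field;
  bool symmetric = false;
};

static void lowerInPlace(char *s) {
  // The cast avoids undefined behavior for negative `char` values.
  for (; *s; ++s)
    *s = static_cast<char>(std::tolower(static_cast<unsigned char>(*s)));
}

Banner parseBanner(const char *line) {
  assert(line && "Received nullptr for line");
  char header[64], object[64], format[64], field[64], symmetry[64];
  Banner b;
  if (std::sscanf(line, "%63s %63s %63s %63s %63s", header, object, format,
                  field, symmetry) != 5)
    return b;
  for (char *s : {header, object, format, field, symmetry})
    lowerInPlace(s);
  b.symmetric = std::strcmp(symmetry, "symmetric") == 0;
  // Only coordinate-format matrices, general or symmetric, are accepted.
  if (std::strcmp(header, "%%matrixmarket") != 0 ||
      std::strcmp(object, "matrix") != 0 ||
      std::strcmp(format, "coordinate") != 0 ||
      (std::strcmp(symmetry, "general") != 0 && !b.symmetric))
    return b;
  bool knownField = false;
  for (const char *k : {"pattern", "real", "integer", "complex"})
    knownField = knownField || std::strcmp(field, k) == 0;
  if (!knownField)
    return b;
  b.field = field;
  b.ok = true;
  return b;
}
```

Unlike the code under review, this sketch reports failure through `ok` instead of calling `FATAL`, which is a design choice made only to keep the example testable.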

/// Read the "extended" FROSTT header. Although not part of the documented
/// format, we assume that the file starts with optional comments followed
/// by two lines that define the rank, the number of nonzeros, and the
/// dimensions sizes (one per rank) of the sparse tensor.
void SparseTensorFile::readExtFROSTTHeader() {
// Skip comments.
while (true) {
readLine();
if (line[0] != '#')
break;
}
// Next line contains RANK and NNZ.
if (sscanf(line, "%" PRIu64 "%" PRIu64 "\n", idata, idata + 1) != 2)
FATAL("Cannot find metadata in %s\n", filename);
// Followed by a line with the dimension sizes (one per rank).
for (uint64_t r = 0; r < idata[0]; r++)
if (fscanf(file, "%" PRIu64, idata + 2 + r) != 1)
FATAL("Cannot find dimension size %s\n", filename);
readLine(); // end of line
// The FROSTT format does not define the data type of the nonzero elements.
valueKind_ = ValueKind::kUndefined;
}
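
The extended FROSTT header layout described above (optional `#` comments, a `RANK NNZ` line, then one line of dimension sizes) can be sketched with standard streams. `FrosttHeader`, `parseExtFrosttHeader`, and `kMaxRank` are hypothetical names; the cap of 510 is an assumption mirroring the `idata[512]` buffer with its first two slots reserved for rank and nnz.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <limits>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical header record for the illustration.
struct FrosttHeader {
  uint64_t rank = 0;
  uint64_t nnz = 0;
  std::vector<uint64_t> dimSizes;
};

// Assumed cap, mirroring `idata[512]` minus the rank and nnz slots.
constexpr uint64_t kMaxRank = 510;

bool parseExtFrosttHeader(std::istream &in, FrosttHeader &out) {
  std::string line;
  // Skip comment lines; stop at the first line not starting with '#'.
  do {
    if (!std::getline(in, line))
      return false;
  } while (!line.empty() && line[0] == '#');
  // That line carries RANK and NNZ.
  std::istringstream meta(line);
  if (!(meta >> out.rank >> out.nnz) || out.rank > kMaxRank)
    return false;
  // The following line carries one dimension size per rank.
  out.dimSizes.resize(out.rank);
  for (uint64_t &d : out.dimSizes)
    if (!(in >> d))
      return false;
  // Consume the rest of the dimension-size line.
  in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
  return true;
}
```

After a successful call, the stream is positioned at the first element line, matching the state the reader above leaves `file` in.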

// Adds a value to a tensor in coordinate scheme. If is_symmetric_value is true,
// also adds the value to its symmetric location.
template <typename T, typename V>
static inline void addValue(T *coo, V value,
const std::vector<uint64_t> indices,
bool is_symmetric_value) {
// TODO: <https://github.com/llvm/llvm-project/issues/54179>
coo->add(indices, value);
// We currently chose to deal with symmetric matrices by fully constructing
// them. In the future, we may want to make symmetry implicit for storage
// reasons.
if (is_symmetric_value)
coo->add({indices[1], indices[0]}, value);
}

// Reads an element of a complex type for the current indices in coordinate
// scheme.
template <typename V>
static inline void readCOOValue(SparseTensorCOO<std::complex<V>> *coo,
const std::vector<uint64_t> indices,
char **linePtr, bool is_pattern,
bool add_symmetric_value) {
// Read two values to make a complex. The external formats always store
// numerical values with the type double, but we cast these values to the
// sparse tensor object type. For a pattern tensor, we arbitrarily pick the
// value 1 for all entries.
V re = is_pattern ? 1.0 : strtod(*linePtr, linePtr);
V im = is_pattern ? 1.0 : strtod(*linePtr, linePtr);
std::complex<V> value = {re, im};
addValue(coo, value, indices, add_symmetric_value);
}

// Reads an element of a non-complex type for the current indices in coordinate
// scheme.
template <typename V,
typename std::enable_if<
!std::is_same<std::complex<float>, V>::value &&
!std::is_same<std::complex<double>, V>::value>::type * = nullptr>
static void inline readCOOValue(SparseTensorCOO<V> *coo,
const std::vector<uint64_t> indices,
char **linePtr, bool is_pattern,
bool is_symmetric_value) {
// The external formats always store these numerical values with the type
// double, but we cast these values to the sparse tensor object type.
// For a pattern tensor, we arbitrarily pick the value 1 for all entries.
double value = is_pattern ? 1.0 : strtod(*linePtr, linePtr);
addValue(coo, value, indices, is_symmetric_value);
}

/// Reads a sparse tensor with the given filename into a memory-resident
/// sparse tensor in coordinate scheme.
template <typename V>
static SparseTensorCOO<V> *
openSparseTensorCOO(char *filename, uint64_t rank, const uint64_t *shape,
const uint64_t *perm, PrimaryType valTp) {
SparseTensorFile stfile(filename);
stfile.openFile();
stfile.readHeader();
// Check tensor element type against the value type in the input file.
SparseTensorFile::ValueKind valueKind = stfile.getValueKind();
bool tensorIsInteger =
(valTp >= PrimaryType::kI64 && valTp <= PrimaryType::kI8);
bool tensorIsReal = (valTp >= PrimaryType::kF64 && valTp <= PrimaryType::kI8);
if ((valueKind == SparseTensorFile::ValueKind::kReal && tensorIsInteger) ||
(valueKind == SparseTensorFile::ValueKind::kComplex && tensorIsReal)) {
FATAL("Tensor element type %d not compatible with values in file %s\n",
static_cast<int>(valTp), filename);
}
stfile.assertMatchesShape(rank, shape);
// Prepare sparse tensor object with per-dimension sizes
// and the number of nonzeros as initial capacity.
uint64_t nnz = stfile.getNNZ();
auto *coo = SparseTensorCOO<V>::newSparseTensorCOO(rank, stfile.getDimSizes(),
perm, nnz);
// Read all nonzero elements.
std::vector<uint64_t> indices(rank);
for (uint64_t k = 0; k < nnz; k++) {
char *linePtr = stfile.readLine();
for (uint64_t r = 0; r < rank; r++) {
uint64_t idx = strtoul(linePtr, &linePtr, 10);
// Add 0-based index.
indices[perm[r]] = idx - 1;
}
readCOOValue(coo, indices, &linePtr, stfile.isPattern(),
stfile.isSymmetric() && indices[0] != indices[1]);
}
// Close the file and return tensor.
stfile.closeFile();
return coo;
}
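
The element loop above relies on `strtoul` and `strtod` advancing a cursor through one line. A stand-alone sketch of that cursor pattern follows; `parseElement` is a hypothetical helper, it ignores the `perm` mapping and the pattern and symmetric cases, and it adds end-pointer checks that the code under review does not perform.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Parses `rank` 1-based indices followed by one value from `line`.
// Stores 0-based indices; returns false on malformed input.
bool parseElement(const char *line, uint64_t rank,
                  std::vector<uint64_t> &indices, double &value) {
  assert(line && "Received nullptr for line");
  const char *p = line;
  indices.resize(rank);
  for (uint64_t r = 0; r < rank; r++) {
    char *end = nullptr;
    const unsigned long long idx = std::strtoull(p, &end, 10);
    // No digits consumed, or an index of 0 (invalid in 1-based formats).
    if (end == p || idx == 0)
      return false;
    indices[r] = idx - 1;
    p = end;
  }
  char *end = nullptr;
  value = std::strtod(p, &end);
  return end != p;
}
```

Taking `const char *` for the input works here because the second argument of `strtoull` and `strtod` is a non-const `char **` regardless of the constness of the first.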

/// Writes the sparse tensor to `dest` in extended FROSTT format.
template <typename V>
static void outSparseTensor(void *tensor, void *dest, bool sort) {
assert(tensor && dest);
auto coo = static_cast<SparseTensorCOO<V> *>(tensor);
if (sort)
coo->sort();
char *filename = static_cast<char *>(dest);
auto &dimSizes = coo->getDimSizes();
auto &elements = coo->getElements();
uint64_t rank = coo->getRank();
uint64_t nnz = elements.size();
std::fstream file;
file.open(filename, std::ios_base::out | std::ios_base::trunc);
assert(file.is_open());
file << "; extended FROSTT format\n" << rank << " " << nnz << std::endl;
for (uint64_t r = 0; r < rank - 1; r++)
file << dimSizes[r] << " ";
file << dimSizes[rank - 1] << std::endl;
for (uint64_t i = 0; i < nnz; i++) {
auto &idx = elements[i].indices;
for (uint64_t r = 0; r < rank; r++)
file << (idx[r] + 1) << " ";
file << elements[i].value << std::endl;
}
file.flush();
file.close();
assert(file.good());
}

/// Initializes sparse tensor from an external COO-flavored format.		/// Initializes sparse tensor from an external COO-flavored format.
template <typename V>		template <typename V>
static SparseTensorStorage<uint64_t, uint64_t, V> *		static SparseTensorStorage<uint64_t, uint64_t, V> *
toMLIRSparseTensor(uint64_t rank, uint64_t nse, uint64_t *shape, V *values,		toMLIRSparseTensor(uint64_t rank, uint64_t nse, uint64_t *shape, V *values,
uint64_t *indices, uint64_t *perm, uint8_t *sparse) {		uint64_t *indices, uint64_t *perm, uint8_t *sparse) {
const DimLevelType *sparsity = (DimLevelType *)(sparse);		const DimLevelType *sparsity = (DimLevelType *)(sparse);
#ifndef NDEBUG		#ifndef NDEBUG
// Verify that perm is a permutation of 0..(rank-1).		// Verify that perm is a permutation of 0..(rank-1).
std::vector<uint64_t> order(perm, perm + rank);		std::vector<uint64_t> order(perm, perm + rank);
std::sort(order.begin(), order.end());		std::sort(order.begin(), order.end());
for (uint64_t i = 0; i < rank; ++i)		for (uint64_t i = 0; i < rank; ++i)
if (i != order[i])		if (i != order[i])
FATAL("Not a permutation of 0..%" PRIu64 "\n", rank);		MLIR_SPARSETENSOR_FATAL("Not a permutation of 0..%" PRIu64 "\n", rank);

// Verify that the sparsity values are supported.		// Verify that the sparsity values are supported.
for (uint64_t i = 0; i < rank; ++i)		for (uint64_t i = 0; i < rank; ++i)
if (sparsity[i] != DimLevelType::kDense &&		if (sparsity[i] != DimLevelType::kDense &&
sparsity[i] != DimLevelType::kCompressed)		sparsity[i] != DimLevelType::kCompressed)
FATAL("Unsupported sparsity value %d\n", static_cast<int>(sparsity[i]));		MLIR_SPARSETENSOR_FATAL("Unsupported sparsity value %d\n",
		static_cast<int>(sparsity[i]));
#endif		#endif

// Convert external format to internal COO.		// Convert external format to internal COO.
auto *coo = SparseTensorCOO<V>::newSparseTensorCOO(rank, shape, perm, nse);		auto *coo = SparseTensorCOO<V>::newSparseTensorCOO(rank, shape, perm, nse);
std::vector<uint64_t> idx(rank);		std::vector<uint64_t> idx(rank);
for (uint64_t i = 0, base = 0; i < nse; i++) {		for (uint64_t i = 0, base = 0; i < nse; i++) {
for (uint64_t r = 0; r < rank; r++)		for (uint64_t r = 0; r < rank; r++)
idx[perm[r]] = indices[base + r];		idx[perm[r]] = indices[base + r];
[51 lines not shown]

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// Public functions which operate on MLIR buffers (memrefs) to interact		// Public functions which operate on MLIR buffers (memrefs) to interact
// with sparse tensors (which are only visible as opaque pointers externally).		// with sparse tensors (which are only visible as opaque pointers externally).
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#define CASE(p, i, v, P, I, V) \		#define CASE(p, i, v, P, I, V) \
		aartbik (inline comment): This was before the //===- on purpose (there are two //===- headers inside the extern). By placing it here it seems to suggest only this "public functions" part is extern.
if (ptrTp == (p) && indTp == (i) && valTp == (v)) { \		if (ptrTp == (p) && indTp == (i) && valTp == (v)) { \
SparseTensorCOO<V> *coo = nullptr; \		SparseTensorCOO<V> *coo = nullptr; \
if (action <= Action::kFromCOO) { \		if (action <= Action::kFromCOO) { \
if (action == Action::kFromFile) { \		if (action == Action::kFromFile) { \
char *filename = static_cast<char *>(ptr); \		char *filename = static_cast<char *>(ptr); \
coo = openSparseTensorCOO<V>(filename, rank, shape, perm, v); \		coo = openSparseTensorCOO<V>(filename, rank, shape, perm, v); \
} else if (action == Action::kFromCOO) { \		} else if (action == Action::kFromCOO) { \
coo = static_cast<SparseTensorCOO<V> *>(ptr); \		coo = static_cast<SparseTensorCOO<V> *>(ptr); \
[149 lines not shown; enclosing context: _mlir_ciface_newSparseTensor(StridedMemRefType<DimLevelType, 1> *aref, // NOLINT]
CASE_SECSAME(OverheadType::kU8, PrimaryType::kI8, uint8_t, int8_t);		CASE_SECSAME(OverheadType::kU8, PrimaryType::kI8, uint8_t, int8_t);

// Complex matrices with wide overhead.		// Complex matrices with wide overhead.
CASE_SECSAME(OverheadType::kU64, PrimaryType::kC64, uint64_t, complex64);		CASE_SECSAME(OverheadType::kU64, PrimaryType::kC64, uint64_t, complex64);
CASE_SECSAME(OverheadType::kU64, PrimaryType::kC32, uint64_t, complex32);		CASE_SECSAME(OverheadType::kU64, PrimaryType::kC32, uint64_t, complex32);

// Unsupported case (add above if needed).		// Unsupported case (add above if needed).
// TODO: better pretty-printing of enum values!		// TODO: better pretty-printing of enum values!
FATAL("unsupported combination of types: <P=%d, I=%d, V=%d>\n",		MLIR_SPARSETENSOR_FATAL(
		"unsupported combination of types: <P=%d, I=%d, V=%d>\n",
static_cast<int>(ptrTp), static_cast<int>(indTp),		static_cast<int>(ptrTp), static_cast<int>(indTp),
static_cast<int>(valTp));		static_cast<int>(valTp));
}		}
#undef CASE		#undef CASE
#undef CASE_SECSAME		#undef CASE_SECSAME

#define IMPL_SPARSEVALUES(VNAME, V) \		#define IMPL_SPARSEVALUES(VNAME, V) \
void _mlir_ciface_sparseValues##VNAME(StridedMemRefType<V, 1> *ref, \		void _mlir_ciface_sparseValues##VNAME(StridedMemRefType<V, 1> *ref, \
void *tensor) { \		void *tensor) { \
assert(ref &&tensor); \		assert(ref &&tensor); \
[138 lines not shown]
FOREVERY_V(IMPL_DELCOO)		FOREVERY_V(IMPL_DELCOO)
#undef IMPL_DELCOO		#undef IMPL_DELCOO

char *getTensorFilename(index_type id) {		char *getTensorFilename(index_type id) {
char var[80];		char var[80];
sprintf(var, "TENSOR%" PRIu64, id);		sprintf(var, "TENSOR%" PRIu64, id);
char *env = getenv(var);		char *env = getenv(var);
if (!env)		if (!env)
FATAL("Environment variable %s is not set\n", var);		MLIR_SPARSETENSOR_FATAL("Environment variable %s is not set\n", var);
return env;		return env;
}		}

void readSparseTensorShape(char *filename, std::vector<uint64_t> *out) {		void readSparseTensorShape(char *filename, std::vector<uint64_t> *out) {
assert(out && "Received nullptr for out-parameter");		assert(out && "Received nullptr for out-parameter");
SparseTensorFile stfile(filename);		SparseTensorFile stfile(filename);
stfile.openFile();		stfile.openFile();
stfile.readHeader();		stfile.readHeader();
[36 lines not shown]

utils/bazel/llvm-project-overlay/mlir/BUILD.bazel

[2,058 lines not shown]
cc_library(		cc_library(
name = "SparseTensorTransforms",		name = "SparseTensorTransforms",
srcs = glob([		srcs = glob([
"lib/Dialect/SparseTensor/Transforms/*.cpp",		"lib/Dialect/SparseTensor/Transforms/*.cpp",
"lib/Dialect/SparseTensor/Transforms/*.h",		"lib/Dialect/SparseTensor/Transforms/*.h",
]),		]),
hdrs = [		hdrs = [
"include/mlir/Dialect/SparseTensor/Transforms/BufferizableOpInterfaceImpl.h",		"include/mlir/Dialect/SparseTensor/Transforms/BufferizableOpInterfaceImpl.h",
"include/mlir/Dialect/SparseTensor/Transforms/Passes.h",		"include/mlir/Dialect/SparseTensor/Transforms/Passes.h",
"include/mlir/ExecutionEngine/SparseTensorUtils.h",
],		],
includes = ["include"],		includes = ["include"],
deps = [		deps = [
":AffineDialect",		":AffineDialect",
":ArithDialect",		":ArithDialect",
":BufferizationDialect",		":BufferizationDialect",
":BufferizationTransforms",		":BufferizationTransforms",
":ComplexDialect",		":ComplexDialect",
[10 lines not shown, still inside deps]
":SCFTransforms",		":SCFTransforms",
":SparseTensorDialect",		":SparseTensorDialect",
":SparseTensorPassIncGen",		":SparseTensorPassIncGen",
":SparseTensorUtils",		":SparseTensorUtils",
":Support",		":Support",
":TensorDialect",		":TensorDialect",
":Transforms",		":Transforms",
":VectorDialect",		":VectorDialect",
":mlir_c_runner_utils",		":mlir_sparse_tensor_utils",
"//llvm:Support",		"//llvm:Support",
],		],
)		)

cc_library(		cc_library(
name = "SparseTensorPipelines",		name = "SparseTensorPipelines",
srcs = glob(["lib/Dialect/SparseTensor/Pipelines/*.cpp"]),		srcs = glob(["lib/Dialect/SparseTensor/Pipelines/*.cpp"]),
hdrs = ["include/mlir/Dialect/SparseTensor/Pipelines/Passes.h"],		hdrs = ["include/mlir/Dialect/SparseTensor/Pipelines/Passes.h"],
[4,440 lines not shown]
cc_binary(		cc_binary(
name = "libmlir_async_runtime.so",		name = "libmlir_async_runtime.so",
linkshared = True,		linkshared = True,
linkstatic = False,		linkstatic = False,
deps = [":mlir_async_runtime"],		deps = [":mlir_async_runtime"],
)		)

cc_library(		cc_library(
		name = "_mlir_float16_utils",
		srcs = ["lib/ExecutionEngine/Float16bits.cpp"],
		hdrs = ["include/mlir/ExecutionEngine/Float16bits.h"],
		copts = ["-Dmlir_float16_utils_EXPORTS"],
		includes = ["include"],
		)

		# Indirection to avoid 'libmlir_float16_utils.so' filename clash.
		alias(
		name = "mlir_float16_utils",
		actual = "_mlir_float16_utils",
		)

		cc_binary(
		name = "libmlir_float16_utils.so",
		linkshared = True,
		linkstatic = False,
		deps = [":mlir_float16_utils"],
		)

		# Unlike mlir_float16_utils, mlir_c_runner_utils, etc, we do not make
		# this a shared library: because on the CMake side, doing so causes
		# issues when building on Windows.
		#
		# We relist Float16bits.h because Enums.h includes it; rather than
		# forcing all direct-dependants state that they also directly-depend
		# on :mlir_float16_utils (to satisfy the layering_check).
		cc_library(
		name = "mlir_sparse_tensor_utils",
		srcs = [
		"lib/ExecutionEngine/SparseTensor/File.cpp",
		"lib/ExecutionEngine/SparseTensor/NNZ.cpp",
		"lib/ExecutionEngine/SparseTensor/Storage.cpp",
		],
		hdrs = [
		"include/mlir/ExecutionEngine/Float16bits.h",
		"include/mlir/ExecutionEngine/SparseTensor/COO.h",
		"include/mlir/ExecutionEngine/SparseTensor/CheckedMul.h",
		"include/mlir/ExecutionEngine/SparseTensor/Enums.h",
		"include/mlir/ExecutionEngine/SparseTensor/ErrorHandling.h",
		"include/mlir/ExecutionEngine/SparseTensor/File.h",
		"include/mlir/ExecutionEngine/SparseTensor/Storage.h",
		],
		copts = ["-Dmlir_sparse_tensor_utils_EXPORTS"],
		includes = ["include"],
		deps = [":mlir_float16_utils"],
		)

		# We relist Enums.h because SparseTensorUtils.h includes/reexports it
		# as part of the public API.
		cc_library(
name = "_mlir_c_runner_utils",		name = "_mlir_c_runner_utils",
srcs = [		srcs = [
"lib/ExecutionEngine/CRunnerUtils.cpp",		"lib/ExecutionEngine/CRunnerUtils.cpp",
"lib/ExecutionEngine/Float16bits.cpp",
"lib/ExecutionEngine/SparseTensorUtils.cpp",		"lib/ExecutionEngine/SparseTensorUtils.cpp",
],		],
hdrs = [		hdrs = [
"include/mlir/ExecutionEngine/CRunnerUtils.h",		"include/mlir/ExecutionEngine/CRunnerUtils.h",
"include/mlir/ExecutionEngine/Float16bits.h",
"include/mlir/ExecutionEngine/Msan.h",		"include/mlir/ExecutionEngine/Msan.h",
		"include/mlir/ExecutionEngine/SparseTensor/Enums.h",
"include/mlir/ExecutionEngine/SparseTensorUtils.h",		"include/mlir/ExecutionEngine/SparseTensorUtils.h",
],		],
includes = ["include"],		includes = ["include"],
		deps = [":mlir_sparse_tensor_utils"],
)		)

# Indirection to avoid 'libmlir_c_runner_utils.so' filename clash.		# Indirection to avoid 'libmlir_c_runner_utils.so' filename clash.
alias(		alias(
name = "mlir_c_runner_utils",		name = "mlir_c_runner_utils",
actual = "_mlir_c_runner_utils",		actual = "_mlir_c_runner_utils",
)		)

[3,001 lines not shown]

Revision Contents

Diff 464033

mlir/include/mlir/ExecutionEngine/Float16bits.h

mlir/include/mlir/ExecutionEngine/SparseTensor/COO.h

mlir/include/mlir/ExecutionEngine/SparseTensor/CheckedMul.h

mlir/include/mlir/ExecutionEngine/SparseTensor/Enums.h

mlir/include/mlir/ExecutionEngine/SparseTensor/ErrorHandling.h

mlir/include/mlir/ExecutionEngine/SparseTensor/File.h

mlir/include/mlir/ExecutionEngine/SparseTensor/Storage.h

mlir/include/mlir/ExecutionEngine/SparseTensorUtils.h

mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.h

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp

mlir/lib/ExecutionEngine/CMakeLists.txt

mlir/lib/ExecutionEngine/Float16bits.cpp

mlir/lib/ExecutionEngine/SparseTensor/CMakeLists.txt

mlir/lib/ExecutionEngine/SparseTensor/File.cpp

mlir/lib/ExecutionEngine/SparseTensor/NNZ.cpp

mlir/lib/ExecutionEngine/SparseTensor/Storage.cpp

mlir/lib/ExecutionEngine/SparseTensorUtils.cpp

utils/bazel/llvm-project-overlay/mlir/BUILD.bazel