This is an archive of the discontinued LLVM Phabricator instance.

Differential D122058

[mlir][sparse] Distinguishing "shape" from "sizes" in variable names
Closed, Public

Authored by wrengr on Mar 18 2022, 8:04 PM.

Details

Reviewers

aartbik
bixia
penpornk

Commits

rGd83a7068277e: [mlir][sparse] Distinguishing "shape" from "sizes" in variable names

Summary

I'm using "shape" to mean the compile-time object, where zeros indicate sizes which are compile-time dynamic; and using "sizes" to mean the run-time object, where zeros indicate a dimension with no coordinates (hence resulting in trivial storage). Because their semantics differ on zeros, it's important to keep them distinguished. Although we do not define separate C++ types to capture the distinction, we can at least use variable names to do so.
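
To make the zero semantics concrete, here is a minimal sketch of what separate C++ types could look like. The patch deliberately does not do this and only renames variables; the names `Shape`, `Sizes`, and `conformsTo` are hypothetical and do not appear in SparseTensorUtils.cpp.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical wrapper types (assumption: not part of the patch, which only
// renames variables).
struct Shape {
  std::vector<uint64_t> dims; // 0 means "size unknown at compile time"
};
struct Sizes {
  std::vector<uint64_t> dims; // 0 means "dimension has no coordinates"
};

// Returns true when the run-time `sizes` instantiate the compile-time `shape`:
// the ranks agree, and every statically known size matches exactly.
inline bool conformsTo(const Sizes &sizes, const Shape &shape) {
  if (sizes.dims.size() != shape.dims.size())
    return false;
  for (uint64_t r = 0; r < shape.dims.size(); ++r)
    if (shape.dims[r] != 0 && shape.dims[r] != sizes.dims[r])
      return false;
  return true;
}
```

Under these assumptions, `conformsTo(Sizes{{3, 4}}, Shape{{0, 4}})` holds because the first dimension is dynamic, while `Shape{{2, 4}}` rejects the same sizes. Passing a `Shape` where a `Sizes` is expected would then be a compile error, which is the protection that variable names alone can only suggest.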

This is (tangential) work towards fixing: https://github.com/llvm/llvm-project/issues/51652

Depends On D122057

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

wrengr created this revision. Mar 18 2022, 8:04 PM

Herald added a project: Restricted Project. Mar 18 2022, 8:04 PM

Herald added subscribers: sdasgup3, wenzhicui, Chia-hungDuan and 17 others.

wrengr requested review of this revision. Mar 18 2022, 8:04 PM

Herald added subscribers: stephenneuendorffer, nicolasvasilache.

wrengr added a child revision: D122059: [mlir][sparse] Marking several things const/static. Mar 18 2022, 8:06 PM

Harbormaster completed remote builds in B155166: Diff 416659. Mar 18 2022, 8:25 PM

aartbik added a comment.

I agree with the general direction, but I am not so sure about "size" vs. "shape" here.
Can we come up with something that keeps "size" for both cases but still makes the distinction?
In some cases we compare one "size" with the other "size".
E.g. dynSize and staticSize, but I am open to better suggestions.

In D122058#3394384, @aartbik wrote:

I agree with the general direction, but I am not so sure about "size" vs. "shape" here.
Can we come up with something that keeps "size" for both cases but still makes the distinction?
In some cases we compare one "size" with the other "size".
E.g. dynSize and staticSize, but I am open to better suggestions.

I just used "shape" since that's what MLIR uses elsewhere for its type-system's notion of the complete collection of compile-time sizes which are potentially of "unknown" (?) value. But I'm cool with other names. As for what names to use, I have a number of thoughts/concerns but can't quite organize them at the moment. So with no particular order nor organization:

(1) I'm fine with the "dyn" prefix. However, I think it's not entirely clear which one it should refer to: on the one hand, it could mean the run-time sizes of the SparseTensorStorage itself, since they are compile-time-dynamic entities; on the other hand, it could mean MLIR's type-system notion of shapes which may contain "unknown" (?) sizes. I haven't noticed the MLIR documentation being particularly clear/consistent about whether "dynamic shape" means the compile-time shape with unknown-sizes vs the run-time sizes which instantiate that compile-time shape; but if they have an official stance on the jargon then we should be sure to follow it.

(2) I don't like the "static" prefix, since in C++ "static" means compile-time-fixed, which doesn't cleanly apply to either of the things here: the run-time sizes obviously aren't compile-time-fixed; and although MLIR's type-system shapes are compile-time entities, it's a bit peculiar to refer to a ranked shape with unknown sizes as "static" imo. I also dislike the prefix because "static" is terribly overloaded in C++.

(3) Once D122061 lands we will also want to make the distinction between dimension-sizes vs assembled-sizes (where the latter term is taken from TACO). Prior to that differential we do already have assembled-sizes, it's just that we don't really care very much about them so there's no pressing need to give them their own naming convention. Assembled-sizes are always compile-time-dynamic (since they're not captured by the MLIR type system nor by the C++ type system), and the only thing we ever compare them against is themselves (i.e., to ensure they haven't changed when mutating something else), so there's no analogue of the "shape"-vs-"sizes" situation for assembled-sizes.

Looking more carefully again, having the sizes array (plural) vs shape (singular, i.e. a tuple of sizes) actually makes a lot of sense to make the distinction.
So this is fine as is!

This revision is now accepted and ready to land. Mar 21 2022, 3:34 PM

wrengr removed a child revision: D122059: [mlir][sparse] Marking several things const/static. Mar 21 2022, 4:41 PM

rebase

Harbormaster completed remote builds in B155694: Diff 417392. Mar 22 2022, 2:07 PM

Closed by commit rGd83a7068277e: [mlir][sparse] Distinguishing "shape" from "sizes" in variable names (authored by wrengr). Mar 22 2022, 2:16 PM

This revision was automatically updated to reflect the committed changes.

wrengr added a commit: rGd83a7068277e: [mlir][sparse] Distinguishing "shape" from "sizes" in variable names.

Revision Contents

Path: mlir/lib/ExecutionEngine/SparseTensorUtils.cpp
Size: 26 lines

Diff 417402

mlir/lib/ExecutionEngine/SparseTensorUtils.cpp

[148 lines not shown, context: public:]
/// the given ordering and expects subsequent add() calls to honor		/// the given ordering and expects subsequent add() calls to honor
/// that same ordering for the given indices. The result is a		/// that same ordering for the given indices. The result is a
/// fully permuted coordinate scheme.		/// fully permuted coordinate scheme.
static SparseTensorCOO<V> *newSparseTensorCOO(uint64_t rank,		static SparseTensorCOO<V> *newSparseTensorCOO(uint64_t rank,
const uint64_t *sizes,		const uint64_t *sizes,
const uint64_t *perm,		const uint64_t *perm,
uint64_t capacity = 0) {		uint64_t capacity = 0) {
std::vector<uint64_t> permsz(rank);		std::vector<uint64_t> permsz(rank);
for (uint64_t r = 0; r < rank; r++)		for (uint64_t r = 0; r < rank; r++) {
		assert(sizes[r] > 0 && "Dimension size zero has trivial storage");
permsz[perm[r]] = sizes[r];		permsz[perm[r]] = sizes[r];
		}
return new SparseTensorCOO<V>(permsz, capacity);		return new SparseTensorCOO<V>(permsz, capacity);
}		}

private:		private:
const std::vector<uint64_t> sizes; // per-dimension sizes		const std::vector<uint64_t> sizes; // per-dimension sizes
std::vector<Element<V>> elements;		std::vector<Element<V>> elements;
bool iteratorLocked;		bool iteratorLocked;
unsigned iteratorPos;		unsigned iteratorPos;
[217 lines not shown, context: public:]
}		}

/// Factory method. Constructs a sparse tensor storage scheme with the given		/// Factory method. Constructs a sparse tensor storage scheme with the given
/// dimensions, permutation, and per-dimension dense/sparse annotations,		/// dimensions, permutation, and per-dimension dense/sparse annotations,
/// using the coordinate scheme tensor for the initial contents if provided.		/// using the coordinate scheme tensor for the initial contents if provided.
/// In the latter case, the coordinate scheme must respect the same		/// In the latter case, the coordinate scheme must respect the same
/// permutation as is desired for the new sparse tensor storage.		/// permutation as is desired for the new sparse tensor storage.
static SparseTensorStorage<P, I, V> *		static SparseTensorStorage<P, I, V> *
newSparseTensor(uint64_t rank, const uint64_t *sizes, const uint64_t *perm,		newSparseTensor(uint64_t rank, const uint64_t *shape, const uint64_t *perm,
const DimLevelType *sparsity, SparseTensorCOO<V> *tensor) {		const DimLevelType *sparsity, SparseTensorCOO<V> *tensor) {
SparseTensorStorage<P, I, V> *n = nullptr;		SparseTensorStorage<P, I, V> *n = nullptr;
if (tensor) {		if (tensor) {
assert(tensor->getRank() == rank);		assert(tensor->getRank() == rank);
for (uint64_t r = 0; r < rank; r++)		for (uint64_t r = 0; r < rank; r++)
assert(sizes[r] == 0 || tensor->getSizes()[perm[r]] == sizes[r]);		assert(shape[r] == 0 || shape[r] == tensor->getSizes()[perm[r]]);
n = new SparseTensorStorage<P, I, V>(tensor->getSizes(), perm, sparsity,		n = new SparseTensorStorage<P, I, V>(tensor->getSizes(), perm, sparsity,
tensor);		tensor);
delete tensor;		delete tensor;
} else {		} else {
std::vector<uint64_t> permsz(rank);		std::vector<uint64_t> permsz(rank);
for (uint64_t r = 0; r < rank; r++)		for (uint64_t r = 0; r < rank; r++) {
permsz[perm[r]] = sizes[r];		assert(shape[r] > 0 && "Dimension size zero has trivial storage");
		permsz[perm[r]] = shape[r];
		}
n = new SparseTensorStorage<P, I, V>(permsz, perm, sparsity);		n = new SparseTensorStorage<P, I, V>(permsz, perm, sparsity);
}		}
return n;		return n;
}		}

private:		private:
/// Appends the next free position of `indices[d]` to `pointers[d]`.		/// Appends the next free position of `indices[d]` to `pointers[d]`.
/// Thus, when called after inserting the last element of a segment,		/// Thus, when called after inserting the last element of a segment,
[239 lines not shown, context: static void readExtFROSTTHeader(FILE *file, char *filename, char *line,]
}		}
fgets(line, kColWidth, file); // end of line		fgets(line, kColWidth, file); // end of line
}		}

/// Reads a sparse tensor with the given filename into a memory-resident		/// Reads a sparse tensor with the given filename into a memory-resident
/// sparse tensor in coordinate scheme.		/// sparse tensor in coordinate scheme.
template <typename V>		template <typename V>
static SparseTensorCOO<V> *openSparseTensorCOO(char *filename, uint64_t rank,		static SparseTensorCOO<V> *openSparseTensorCOO(char *filename, uint64_t rank,
const uint64_t *sizes,		const uint64_t *shape,
const uint64_t *perm) {		const uint64_t *perm) {
// Open the file.		// Open the file.
FILE *file = fopen(filename, "r");		FILE *file = fopen(filename, "r");
if (!file) {		if (!file) {
assert(filename && "Received nullptr for filename");		assert(filename && "Received nullptr for filename");
fprintf(stderr, "Cannot find file %s\n", filename);		fprintf(stderr, "Cannot find file %s\n", filename);
exit(1);		exit(1);
}		}
[9 lines not shown, context: if (strstr(filename, ".mtx")) {]
fprintf(stderr, "Unknown format %s\n", filename);		fprintf(stderr, "Unknown format %s\n", filename);
exit(1);		exit(1);
}		}
// Prepare sparse tensor object with per-dimension sizes		// Prepare sparse tensor object with per-dimension sizes
// and the number of nonzeros as initial capacity.		// and the number of nonzeros as initial capacity.
assert(rank == idata[0] && "rank mismatch");		assert(rank == idata[0] && "rank mismatch");
uint64_t nnz = idata[1];		uint64_t nnz = idata[1];
for (uint64_t r = 0; r < rank; r++)		for (uint64_t r = 0; r < rank; r++)
assert((sizes[r] == 0 || sizes[r] == idata[2 + r]) &&		assert((shape[r] == 0 || shape[r] == idata[2 + r]) &&
"dimension size mismatch");		"dimension size mismatch");
SparseTensorCOO<V> *tensor =		SparseTensorCOO<V> *tensor =
SparseTensorCOO<V>::newSparseTensorCOO(rank, idata + 2, perm, nnz);		SparseTensorCOO<V>::newSparseTensorCOO(rank, idata + 2, perm, nnz);
// Read all nonzero elements.		// Read all nonzero elements.
std::vector<uint64_t> indices(rank);		std::vector<uint64_t> indices(rank);
for (uint64_t k = 0; k < nnz; k++) {		for (uint64_t k = 0; k < nnz; k++) {
if (!fgets(line, kColWidth, file)) {		if (!fgets(line, kColWidth, file)) {
fprintf(stderr, "Cannot find next line of data in %s\n", filename);		fprintf(stderr, "Cannot find next line of data in %s\n", filename);
[146 lines not shown]
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#define CASE(p, i, v, P, I, V) \		#define CASE(p, i, v, P, I, V) \
if (ptrTp == (p) && indTp == (i) && valTp == (v)) { \		if (ptrTp == (p) && indTp == (i) && valTp == (v)) { \
SparseTensorCOO<V> *tensor = nullptr; \		SparseTensorCOO<V> *tensor = nullptr; \
if (action <= Action::kFromCOO) { \		if (action <= Action::kFromCOO) { \
if (action == Action::kFromFile) { \		if (action == Action::kFromFile) { \
char *filename = static_cast<char *>(ptr); \		char *filename = static_cast<char *>(ptr); \
tensor = openSparseTensorCOO<V>(filename, rank, sizes, perm); \		tensor = openSparseTensorCOO<V>(filename, rank, shape, perm); \
} else if (action == Action::kFromCOO) { \		} else if (action == Action::kFromCOO) { \
tensor = static_cast<SparseTensorCOO<V> *>(ptr); \		tensor = static_cast<SparseTensorCOO<V> *>(ptr); \
} else { \		} else { \
assert(action == Action::kEmpty); \		assert(action == Action::kEmpty); \
} \		} \
return SparseTensorStorage<P, I, V>::newSparseTensor(rank, sizes, perm, \		return SparseTensorStorage<P, I, V>::newSparseTensor(rank, shape, perm, \
sparsity, tensor); \		sparsity, tensor); \
} \		} \
if (action == Action::kEmptyCOO) \		if (action == Action::kEmptyCOO) \
return SparseTensorCOO<V>::newSparseTensorCOO(rank, sizes, perm); \		return SparseTensorCOO<V>::newSparseTensorCOO(rank, shape, perm); \
tensor = static_cast<SparseTensorStorage<P, I, V> *>(ptr)->toCOO(perm); \		tensor = static_cast<SparseTensorStorage<P, I, V> *>(ptr)->toCOO(perm); \
if (action == Action::kToIterator) { \		if (action == Action::kToIterator) { \
tensor->startIterator(); \		tensor->startIterator(); \
} else { \		} else { \
assert(action == Action::kToCOO); \		assert(action == Action::kToCOO); \
} \		} \
return tensor; \		return tensor; \
}		}
[112 lines not shown, context: _mlir_ciface_newSparseTensor(StridedMemRefType<DimLevelType, 1> *aref, // NOLINT]
StridedMemRefType<index_type, 1> *pref,		StridedMemRefType<index_type, 1> *pref,
OverheadType ptrTp, OverheadType indTp,		OverheadType ptrTp, OverheadType indTp,
PrimaryType valTp, Action action, void *ptr) {		PrimaryType valTp, Action action, void *ptr) {
assert(aref && sref && pref);		assert(aref && sref && pref);
assert(aref->strides[0] == 1 && sref->strides[0] == 1 &&		assert(aref->strides[0] == 1 && sref->strides[0] == 1 &&
pref->strides[0] == 1);		pref->strides[0] == 1);
assert(aref->sizes[0] == sref->sizes[0] && sref->sizes[0] == pref->sizes[0]);		assert(aref->sizes[0] == sref->sizes[0] && sref->sizes[0] == pref->sizes[0]);
const DimLevelType *sparsity = aref->data + aref->offset;		const DimLevelType *sparsity = aref->data + aref->offset;
const index_type *sizes = sref->data + sref->offset;		const index_type *shape = sref->data + sref->offset;
const index_type *perm = pref->data + pref->offset;		const index_type *perm = pref->data + pref->offset;
uint64_t rank = aref->sizes[0];		uint64_t rank = aref->sizes[0];

// Rewrite kIndex to kU64, to avoid introducing a bunch of new cases.		// Rewrite kIndex to kU64, to avoid introducing a bunch of new cases.
// This is safe because of the static_assert above.		// This is safe because of the static_assert above.
if (ptrTp == OverheadType::kIndex)		if (ptrTp == OverheadType::kIndex)
ptrTp = OverheadType::kU64;		ptrTp = OverheadType::kU64;
if (indTp == OverheadType::kIndex)		if (indTp == OverheadType::kIndex)
[279 lines not shown]
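
Both factory methods in the diff build the permuted size vector with the loop `permsz[perm[r]] = sizes[r]`, and the patch adds an assertion that each size is nonzero. The following is a minimal standalone sketch of that loop; the free function `permuteSizes` and its range check on `perm` are assumptions made for illustration and are not part of SparseTensorUtils.cpp.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper mirroring the loop in newSparseTensorCOO: the size of
// dimension r is stored at position perm[r] of the result.
inline std::vector<uint64_t> permuteSizes(const std::vector<uint64_t> &sizes,
                                          const std::vector<uint64_t> &perm) {
  assert(sizes.size() == perm.size() && "rank mismatch");
  std::vector<uint64_t> permsz(sizes.size());
  for (uint64_t r = 0; r < sizes.size(); ++r) {
    // Extra guard added for this sketch only (assumption).
    assert(perm[r] < sizes.size() && "perm entry out of range");
    // Same check the patch adds: run-time sizes must be nonzero here.
    assert(sizes[r] > 0 && "Dimension size zero has trivial storage");
    permsz[perm[r]] = sizes[r];
  }
  return permsz;
}
```

For example, sizes `{10, 20, 30}` with perm `{2, 0, 1}` give `{20, 30, 10}`. The nonzero assertion only makes sense for run-time sizes, since a zero in a compile-time shape means "dynamic" rather than "empty"; that difference is what the renaming in this revision is meant to keep visible.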