This is an archive of the discontinued LLVM Phabricator instance.

Differential D138627

[mlir][sparse] Refactoring: abstract sparse tensor memory scheme into a SparseTensorDescriptor class.
ClosedPublic

Authored by Peiming on Nov 23 2022, 4:50 PM.

Details

Reviewers

aartbik
nicolasvasilache
wrengr
bixia

Commits

rG8a7e69d145ff: [mlir][sparse] Refactoring: abstract sparse tensor memory scheme into a…

Summary

This patch abstracts the sparse tensor memory scheme into a SparseTensorDescriptor class. Previously, field accesses were performed in a relatively error-prone way; this patch hides the hairy details behind a SparseTensorDescriptor class so that users can access sparse tensor fields in a more cohesive way.
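
As a rough illustration of the idea in the summary, the sketch below wraps a flat array of fields in a small accessor class so that callers never compute field positions by hand. It is not the code from this patch: `FieldValue` and `DescriptorSketch` are hypothetical names, strings stand in for MLIR values, and the layout (dimSizes first, memSizes second, values last) is only assumed from the layout comment in the diff.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an MLIR Value; used only to keep the sketch
// self-contained.
using FieldValue = std::string;

// Assumed layout for this sketch: [dimSizes, memSizes, data fields..., values].
class DescriptorSketch {
public:
  explicit DescriptorSketch(const std::vector<FieldValue> &fields)
      : fields(fields) {
    assert(fields.size() >= 3 && "need dimSizes, memSizes, and values");
  }

  // Named accessors replace ad-hoc index arithmetic at every call site.
  const FieldValue &getDimSizes() const { return fields[0]; }
  const FieldValue &getMemSizes() const { return fields[1]; }
  // The values array is the last field in the assumed layout.
  const FieldValue &getValues() const { return fields.back(); }

private:
  // Non-owning view: the caller keeps the field array alive.
  const std::vector<FieldValue> &fields;
};
```

With such a wrapper, a layout change only touches the accessor bodies rather than every place that previously wrote something like `fields[fields.size() - 1]`.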

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

Peiming created this revision. Nov 23 2022, 4:50 PM

Herald added a reviewer: aartbik. Nov 23 2022, 4:50 PM

Herald added a project: Restricted Project.

Herald added subscribers: jsetoain, Moerafaat, anlunx and 21 others.

Peiming requested review of this revision. Nov 23 2022, 4:50 PM

Herald added a reviewer: nicolasvasilache. Nov 23 2022, 4:50 PM

Herald added a project: Restricted Project.

Herald added subscribers: stephenneuendorffer, nicolasvasilache.

minor improvement.

minor fixes.

Harbormaster completed remote builds in B199331: Diff 477663. Nov 23 2022, 6:33 PM

Peiming added reviewers: wrengr, bixia. Nov 30 2022, 9:38 AM

code cleanup.

Herald added a subscriber: hanchung. Nov 30 2022, 12:21 PM

Harbormaster completed remote builds in B200341: Diff 479047. Nov 30 2022, 12:41 PM

replaces more places using SparseTensorDescriptor.

Harbormaster completed remote builds in B200389: Diff 479113. Nov 30 2022, 5:33 PM

add some comments.

Harbormaster completed remote builds in B200431: Diff 479170. Nov 30 2022, 10:35 PM

replace all SmallVectorImpl to SparseTensorDescriptor

Peiming edited the summary of this revision. Dec 1 2022, 9:21 AM

minor fixes.

aartbik added inline comments. Dec 1 2022, 9:57 AM

mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.cpp
99	This description describes the layout and thus drives all the code. Can you please move this into the header, inside the class documentation.

add some comments.

cleanup.

Peiming added inline comments. Dec 1 2022, 11:31 AM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorCodegen.cpp
167	A better way to create PushBackOp would be `PushBackOp(builder, loc, descriptor, fidx, value)`, which is clearer and can be read as "push a value into the sparse tensor descriptor at the given field index". But it would require us to put `SparseTensorDescriptor` in a publicly available place (and I am not sure whether that is wanted).

Peiming added a child revision: D139141: [mlir][sparse] add getPointerType/getIndexType to SparseTensorEncodingAttr. Dec 1 2022, 1:25 PM

Harbormaster completed remote builds in B200573: Diff 479357. Dec 1 2022, 1:36 PM

Peiming removed a child revision: D139141: [mlir][sparse] add getPointerType/getIndexType to SparseTensorEncodingAttr. Dec 1 2022, 1:50 PM

rebase

cleanup.

code cleanup.

Harbormaster completed remote builds in B200632: Diff 479434. Dec 1 2022, 6:16 PM

rebase

aartbik added inline comments. Dec 2 2022, 4:29 PM

mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.h
336	I think this line should not be here? Copied from somewhere else?
357	relying on ad-hoc
377	, and a (Oxford comma ;-)
387	an array
390	layouted -> laid out? (I think that is the proper english, not sure though...)
412	to restore (no d)
437	it feels like we have many more getters than we really need? am I right, is there a way to condense the API a bit?
441	Although this solution is much better at information hiding than the original, we now need to scan each time for every field query. Although this is never very deep (rank bounded), it is still a bit wasteful. Can we avoid this by precomputing offsets per dimension?
448	this feels very convoluted? why do we need a dim for a value query?
mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorCodegen.cpp
251	note that this works for now, but we actually planned to implement a heuristic here, which will need to scan the ranks + level types again
273	continue
386	yeah, agreed, this made more sense in the original but the alternative is to copy and create a fully new array....

remove unused APIs + fix typos.

mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.h
441	I also thought about it. We could, but storing the map would make the class expensive to copy. How about I do it in the next revision and change all the descriptors to references?
448	deleted.
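
The trade-off discussed at line 441 (scanning per query versus precomputing offsets) can be sketched as follows. This is only one possible shape, not what the patch or its follow-up does: `LevelKind`, `FieldOffsets`, `computeOffsets`, and the `kMaxRank` bound are all hypothetical, and the field order is assumed from the layout comment in CodegenUtils.h (dense levels allocate nothing, compressed levels allocate pointers then indices, singleton levels allocate indices, values come last).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical level kinds for this sketch; the patch uses DimLevelType.
enum class LevelKind { Dense, Compressed, Singleton };

constexpr unsigned kMaxRank = 8;      // assumed upper bound, not from the patch
constexpr unsigned kDataFieldIdx = 2; // dimSizes and memSizes come first
constexpr unsigned kNoField = ~0u;    // marks "no such field for this level"

// Per-dimension field offsets, computed once instead of rescanning per query.
struct FieldOffsets {
  std::array<unsigned, kMaxRank> ptrIdx;
  std::array<unsigned, kMaxRank> idxIdx;
  unsigned valIdx;
};

inline FieldOffsets computeOffsets(const std::vector<LevelKind> &levels) {
  assert(levels.size() <= kMaxRank);
  FieldOffsets off;
  off.ptrIdx.fill(kNoField);
  off.idxIdx.fill(kNoField);
  unsigned next = kDataFieldIdx;
  for (std::size_t d = 0; d < levels.size(); ++d) {
    if (levels[d] == LevelKind::Compressed) {
      off.ptrIdx[d] = next++;
      off.idxIdx[d] = next++;
    } else if (levels[d] == LevelKind::Singleton) {
      off.idxIdx[d] = next++;
    }
    // Dense levels allocate no field.
  }
  off.valIdx = next; // the values array follows all level fields
  return off;
}

// Fixed-size tables keep the struct trivially copyable, unlike a heap-backed map.
static_assert(std::is_trivially_copyable_v<FieldOffsets>,
              "offset table should stay cheap to copy");
```

A fixed-size table trades a hard rank bound for copy cost; a heap-backed map would lift the bound but runs into the copy concern raised in the reply above.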

Peiming added inline comments. Dec 2 2022, 5:08 PM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorCodegen.cpp
251	Then, you can have another foreach before this to compute the heuristic.

Harbormaster completed remote builds in B200871: Diff 479780. Dec 2 2022, 6:23 PM

rebase.

aartbik accepted this revision. Dec 5 2022, 2:08 PM

This revision is now accepted and ready to land. Dec 5 2022, 2:08 PM

This revision was landed with ongoing or failed builds. Dec 5 2022, 2:12 PM

Closed by commit rG8a7e69d145ff: [mlir][sparse] Refactoring: abstract sparse tensor memory scheme into a… (authored by Peiming).

This revision was automatically updated to reflect the committed changes.

Peiming added a commit: rG8a7e69d145ff: [mlir][sparse] Refactoring: abstract sparse tensor memory scheme into a….

Harbormaster completed remote builds in B201146: Diff 480148. Dec 5 2022, 2:23 PM

@Peiming, this patch breaks the flang build.

In D138627#3972470, @PeteSteinfeld wrote:

@Peiming, this patch breaks the flang build.

It is not flang specific issue, though. The following gcc buildbot fails: https://lab.llvm.org/buildbot/#/builders/160/builds/13724

In D138627#3972495, @vzakhari wrote:

It is not flang specific issue, though. The following gcc buildbot fails: https://lab.llvm.org/buildbot/#/builders/160/builds/13724

This also broke the windows mlir buildbot. Please address the issues or revert.

In D138627#3972566, @stella.stamenova wrote:

This also broke the windows mlir buildbot. Please address the issues or revert.

Okay, Investigating

The failure seems unrelated to this patch...

stella.stamenova added a reverting change: rG10033a179f0c: Revert "[mlir][sparse] Refactoring: abstract sparse tensor memory scheme into a…. Dec 5 2022, 5:21 PM

In D138627#3972566, @stella.stamenova wrote:

This also broke the windows mlir buildbot. Please address the issues or revert.

The windows failure should be fixed by https://reviews.llvm.org/D139383

In D138627#3972777, @Peiming wrote:

The failure seems unrelated to this patch...

The build log seems to directly point to your code (https://lab.llvm.org/buildbot/#/builders/160/builds/13724/steps/5/logs/stdio):

In file included from ../llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/SparseBufferRewriting.cpp:14:
../llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.h:396:13: error: explicit specialization in non-namespace scope ‘class mlir::sparse_tensor::SparseTensorDescriptorImpl<mut>’
  396 |   template <>
      |             ^
../llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.h:397:10: error: too few template-parameter-lists
  397 |   struct ArrayStorage<false> {
      |          ^~~~~~~~~~~~~~~~~~~
../llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.h:401:13: error: explicit specialization in non-namespace scope ‘class mlir::sparse_tensor::SparseTensorDescriptorImpl<mut>’
  401 |   template <>
      |             ^
../llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.h:402:10: error: too few template-parameter-lists
  402 |   struct ArrayStorage<true> {
      |          ^~~~~~~~~~~~~~~~~~
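
The errors above come from the explicit specializations of `ArrayStorage` written inside the class body, which GCC rejects in this log. One portable way to select a type from a `bool` template parameter without any class-scope specialization is `std::conditional_t`. The sketch below shows that alternative under stated assumptions: `ReadOnlyRange`, `MutableVector`, and `DescriptorImplSketch` are hypothetical stand-ins, and this is not claimed to be the fix that eventually landed.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical stand-ins for ValueRange and SmallVectorImpl<Value>.
struct ReadOnlyRange {};
using MutableVector = std::vector<int>;

template <bool mut>
class DescriptorImplSketch {
public:
  // Picks the storage type directly from `mut`; no nested template and no
  // explicit specialization inside the class are needed.
  using Storage = std::conditional_t<mut, MutableVector &, ReadOnlyRange>;
};

static_assert(
    std::is_same_v<DescriptorImplSketch<false>::Storage, ReadOnlyRange>,
    "immutable descriptors use the read-only range");
static_assert(
    std::is_same_v<DescriptorImplSketch<true>::Storage, MutableVector &>,
    "mutable descriptors use a reference to the vector");
```

Another option is to keep the helper template but declare it and its specializations at namespace scope, outside the class.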

In D138627#3972803, @vzakhari wrote:

The build log seems to directly point to your code (https://lab.llvm.org/buildbot/#/builders/160/builds/13724/steps/5/logs/stdio):

In file included from ../llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/SparseBufferRewriting.cpp:14:
../llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.h:396:13: error: explicit specialization in non-namespace scope ‘class mlir::sparse_tensor::SparseTensorDescriptorImpl<mut>’
  396 |   template <>
      |             ^
../llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.h:397:10: error: too few template-parameter-lists
  397 |   struct ArrayStorage<false> {
      |          ^~~~~~~~~~~~~~~~~~~
../llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.h:401:13: error: explicit specialization in non-namespace scope ‘class mlir::sparse_tensor::SparseTensorDescriptorImpl<mut>’
  401 |   template <>
      |             ^
../llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.h:402:10: error: too few template-parameter-lists
  402 |   struct ArrayStorage<true> {
      |          ^~~~~~~~~~~~~~~~~~

Okay, I did not see this. I will try with GCC

@Peiming : I reverted your change because it broke multiple buildbots - windows and gcc included. I see that you've re-committed it. Did you make sure to address all three build breaks that were reported including the one @vzakhari pointed out? In cases like this, please make sure NOT to recommit your changes without making sure that all build breaks are addressed as it is disruptive to have buildbots that are broken especially when a change is identified as the source.

In D138627#3972826, @stella.stamenova wrote:

@Peiming : I reverted your change because it broke multiple buildbots - windows and gcc included. I see that you've re-committed it. Did you make sure to address all three build breaks that were reported including the one @vzakhari pointed out? In cases like this, please make sure NOT to recommit your changes without making sure that all build breaks are addressed as it is disruptive to have buildbots that are broken especially when a change is identified as the source.

Yes, of course. I haven't pushed the code; I am waiting for the pre-merge Windows build to finish.

Revision Contents

Path	Size
mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensor.h	15 lines
mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.h	307 lines
mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.cpp	112 lines
mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorCodegen.cpp	488 lines

Diff 479357

mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensor.h

(69 lines not shown)	inline bool isCompressedDim(RankedTensorType type, uint64_t d) {
return isCompressedDLT(getDimLevelType(type, d));		return isCompressedDLT(getDimLevelType(type, d));
}		}

/// Convenience function to test for singleton dimension (0 <= d < rank).		/// Convenience function to test for singleton dimension (0 <= d < rank).
inline bool isSingletonDim(RankedTensorType type, uint64_t d) {		inline bool isSingletonDim(RankedTensorType type, uint64_t d) {
return isSingletonDLT(getDimLevelType(type, d));		return isSingletonDLT(getDimLevelType(type, d));
}		}

		/// Convenience function to test for dense dimension (0 <= d < rank).
		inline bool isDenseDim(SparseTensorEncodingAttr enc, uint64_t d) {
		return isDenseDLT(getDimLevelType(enc, d));
		}

		/// Convenience function to test for compressed dimension (0 <= d < rank).
		inline bool isCompressedDim(SparseTensorEncodingAttr enc, uint64_t d) {
		return isCompressedDLT(getDimLevelType(enc, d));
		}

		/// Convenience function to test for singleton dimension (0 <= d < rank).
		inline bool isSingletonDim(SparseTensorEncodingAttr enc, uint64_t d) {
		return isSingletonDLT(getDimLevelType(enc, d));
		}

//		//
// Dimension level properties.		// Dimension level properties.
//		//

/// Convenience function to test for ordered property in the		/// Convenience function to test for ordered property in the
/// given dimension (0 <= d < rank).		/// given dimension (0 <= d < rank).
inline bool isOrderedDim(RankedTensorType type, uint64_t d) {		inline bool isOrderedDim(RankedTensorType type, uint64_t d) {
return isOrderedDLT(getDimLevelType(type, d));		return isOrderedDLT(getDimLevelType(type, d));
(27 lines not shown)

mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.h

(305 lines not shown)
}		}

inline bool isZeroRankedTensorOrScalar(Type type) {		inline bool isZeroRankedTensorOrScalar(Type type) {
auto rtp = type.dyn_cast<RankedTensorType>();		auto rtp = type.dyn_cast<RankedTensorType>();
return !rtp || rtp.getRank() == 0;		return !rtp || rtp.getRank() == 0;
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// SparseTensorLoopEmiter class, manages sparse tensors and helps to generate		// SparseTensorDescriptor and helpers, manage the sparse tensor memory layout
// loop structure to (co)-iterate sparse tensors.		// scheme.
		//
		// Sparse tensor storage scheme for rank-dimensional tensor is organized
		// as a single compound type with the following fields. Note that every
		// memref with ? size actually behaves as a "vector", i.e. the stored
		// size is the capacity and the used size resides in the memSizes array.
		//
		// struct {
		// memref<rank x index> dimSizes ; size in each dimension
		// memref<n x index> memSizes ; sizes of ptrs/inds/values
		// ; per-dimension d:
		// ; if dense:
		// <nothing>
		// ; if compresed:
		// memref<? x ptr> pointers-d ; pointers for sparse dim d
		// memref<? x idx> indices-d ; indices for sparse dim d
		// ; if singleton:
		// memref<? x idx> indices-d ; indices for singleton dim d
		// memref<? x eltType> values ; values
		// };
		//
		// The dimSizes array and memSizes array.
		//
		//===----------------------------------------------------------------------===//
		enum class SparseTensorFieldKind {
		DimSizes,
		MemSizes,
		PtrMemRef,
		IdxMemRef,
		ValMemRef
		};

		constexpr uint64_t dimSizesIdx = 0;
		constexpr uint64_t memSizesIdx = dimSizesIdx + 1;
		constexpr uint64_t dataFieldIdx = memSizesIdx + 1;

		/// For each field that will be allocated for the given sparse tensor encoding,
		/// calls the callback with the corresponding field index, field kind, dimension
		/// (for sparse tensor level memrefs) and dimlevelType.
		/// The field index always starts with zero and increments by one between two
		/// callback invocations.
		/// Ideally, all other methods should rely on this function to query a sparse
		/// tensor fields instead of relying ad-hoc index computation.
		void foreachFieldInSparseTensor(
		SparseTensorEncodingAttr,
		llvm::function_ref<bool(unsigned /*fieldIdx*/,
		SparseTensorFieldKind /*fieldKind*/,
		unsigned /*dim (if applicable)*/,
		DimLevelType /*DLT (if applicable)*/)>);

		/// Same as above, except that it also builds the Type for the corresponding
		/// field.
		void foreachFieldAndTypeInSparseTensor(
		RankedTensorType,
		llvm::function_ref<bool(Type /*fieldType*/, unsigned /*fieldIdx*/,
		SparseTensorFieldKind /*fieldKind*/,
		unsigned /*dim (if applicable)*/,
		DimLevelType /*DLT (if applicable)*/)>);

		/// Gets the total number of fields for the given sparse tensor encoding.
		unsigned getNumFieldsFromEncoding(SparseTensorEncodingAttr enc);

		/// Gets the total number of data fields (index arrays, pointer arrays and a
		/// value array) for the given sparse tensor encoding.
		unsigned getNumDataFieldsFromEncoding(SparseTensorEncodingAttr enc);

		/// Get the index of the field in memSizes (only valid for data fields).
		inline unsigned getFieldMemSizesIndex(unsigned fid) {
		assert(fid >= dataFieldIdx);
		return fid - dataFieldIdx;
		}

		/// A helper class around a array of values that corresponding to a sparse
		/// tensor, provides a set of meaningful APIs to query and update a particular
		/// field in a consistent way.
		/// Users should not make assumption on how a sparse tensor is layouted but
		/// instead relies on this class to access the right value for the right field.
		template <bool mut>
		class SparseTensorDescriptorImpl {
		private:
		template <bool>
		struct ArrayStorage;

		template <>
		struct ArrayStorage<false> {
		using ValueArray = ValueRange;
		};

		template <>
		struct ArrayStorage<true> {
		using ValueArray = SmallVectorImpl<Value> &;
		};

		// Uses ValueRange for immuatable descriptors; uses SmallVectorImpl<Value> &
		// for mutable descriptors.
		// Using SmallVector for mutable descriptor allows users to reuse it as a tmp
		// buffers to append value for some special cases, though users should be
		// responsible to restored the buffer to legal states after their use. It is
		// probably not a clean way, but it is the most efficient way to avoid copying
		// the fields into another SmallVector. If a more clear way is wanted, we
		// should change it to MutableArrayRef instead.
		using Storage = typename ArrayStorage<mut>::ValueArray;

		public:
		SparseTensorDescriptorImpl(Type tp, Storage fields)
		: rType(tp.cast<RankedTensorType>()), fields(fields) {
		assert(getSparseTensorEncoding(tp) &&
		getNumFieldsFromEncoding(getSparseTensorEncoding(tp)) ==
		fields.size());
		// We should make sure the class is trivially copyable (and should be small
		// enough) such that we can pass it by value.
		static_assert(
		std::is_trivially_copyable_v<SparseTensorDescriptorImpl<mut>>);
		}

		// Implicit (and cheap) type conversion from MutSparseTensorDescriptor to
		// SparseTensorDescriptor.
		template <typename T = SparseTensorDescriptorImpl<true>>
		/*implicit*/ SparseTensorDescriptorImpl(std::enable_if_t<!mut, T> &mDesc)
		: rType(mDesc.getTensorType()), fields(mDesc.getFields()) {}

		///
		/// Getters: get the field index for required field.
		///

		unsigned getPtrMemRefIndex(unsigned ptrDim) const {
		return getFieldIndex(ptrDim, SparseTensorFieldKind::PtrMemRef);
		}

		unsigned getIdxMemRefIndex(unsigned idxDim) const {
		return getFieldIndex(idxDim, SparseTensorFieldKind::IdxMemRef);
		}

		unsigned getDataFieldIndex(unsigned dim, SparseTensorFieldKind kind) const {
		if (kind == SparseTensorFieldKind::ValMemRef)
		return getValMemRefIndex();
		return getFieldIndex(dim, kind);
		}

		unsigned getValMemRefIndex() const { return fields.size() - 1; }

		unsigned getPtrMemSizesIndex(unsigned dim) const {
		return getPtrMemRefIndex(dim) - dataFieldIdx;
		}

		unsigned getIdxMemSizesIndex(unsigned dim) const {
		return getIdxMemRefIndex(dim) - dataFieldIdx;
		}

		unsigned getValMemSizesIndex() const {
		return getValMemRefIndex() - dataFieldIdx;
		}

		unsigned getFieldMemSizesIndex(unsigned dim,
		SparseTensorFieldKind kind) const {
		return getFieldMemSizesIndex(getDataFieldIndex(dim, kind));
		}

		unsigned getNumFields() const { return fields.size(); }

		///
		/// Getters: get the value for required field.
		///

		Value getDimSizesMemRef() const { return fields[dimSizesIdx]; }
		Value getMemSizesMemRef() const { return fields[memSizesIdx]; }

		Value getPtrMemRef(unsigned ptrDim) const {
		return fields[getPtrMemRefIndex(ptrDim)];
		}

		Value getIdxMemRef(unsigned idxDim) const {
		return fields[getIdxMemRefIndex(idxDim)];
		}

		Value getValMemRef() const { return fields[getValMemRefIndex()]; }

		Value getField(unsigned fid) const {
		assert(fid < fields.size());
		return fields[fid];
		}

		///
		/// Setters: update the value for required field (only enabled for
		/// MutSparseTensorDescriptor).
		///

		template <typename T = Value>
		void setDimSizesMemRef(std::enable_if_t<mut, T> v) {
		fields[dimSizesIdx] = v;
		}

		template <typename T = Value>
		void setMemSizesMemRef(std::enable_if_t<mut, T> v) {
		fields[memSizesIdx] = v;
		}

		template <typename T = Value>
		void setPtrMemRef(unsigned ptrDim, std::enable_if_t<mut, T> v) {
		fields[getPtrMemRefIndex(ptrDim)] = v;
		}

		template <typename T = Value>
		void setIdxMemRef(unsigned idxDim, std::enable_if_t<mut, T> v) {
		fields[getIdxMemRefIndex(idxDim)] = v;
		}

		template <typename T = Value>
		void setLvlMemRef(unsigned dim, SparseTensorFieldKind kind,
		std::enable_if_t<mut, T> v) {
		fields[getDataFieldIndex(dim, kind)] = v;
		}

		template <typename T = Value>
		void setValMemRef(std::enable_if_t<mut, T> v) {
		fields[getValMemRefIndex()] = v;
		}

		template <typename T = Value>
		void setField(unsigned fid, std::enable_if_t<mut, T> v) {
		assert(fid < fields.size());
		fields[fid] = v;
		}

		RankedTensorType getTensorType() const { return rType; }
		Storage getFields() const { return fields; }

		Type getElementType(unsigned fidx) const {
		return fields[fidx].getType().template cast<MemRefType>().getElementType();
		}

		// TODO: a better places for these functions should be in
		// SparseTensorEncodingAttr.
		Type getPtrElementType() const {
		auto *ctx = rType.getContext();
		unsigned ptrWidth = getSparseTensorEncoding(rType).getPointerBitWidth();
		Type indexType = IndexType::get(ctx);
		return ptrWidth ? IntegerType::get(ctx, ptrWidth) : indexType;
		}

		Type getIdxElementType() const {
		auto *ctx = rType.getContext();
		unsigned idxWidth = getSparseTensorEncoding(rType).getIndexBitWidth();
		Type indexType = IndexType::get(ctx);
		return idxWidth ? IntegerType::get(ctx, idxWidth) : indexType;
		}

		private:
		unsigned getFieldIndex(unsigned dim, SparseTensorFieldKind kind) const {
		unsigned fieldIdx = -1u;
		foreachFieldInSparseTensor(
		getSparseTensorEncoding(rType),
		[dim, kind, &fieldIdx](unsigned fIdx, SparseTensorFieldKind fKind,
		unsigned fDim, DimLevelType dlt) -> bool {
		if (fDim == dim && kind == fKind) {
		fieldIdx = fIdx;
		// Returns false to break the iteration.
		return false;
		}
		return true;
		});
		assert(fieldIdx != -1u);
		return fieldIdx;
		}

		RankedTensorType rType;
		Storage fields;
		};

		using SparseTensorDescriptor = SparseTensorDescriptorImpl<false>;
		using MutSparseTensorDescriptor = SparseTensorDescriptorImpl<true>;

		//===----------------------------------------------------------------------===//
		// SparseTensorLoopEmiter class, manages sparse tensors and helps to
		// generate loop structure to (co)-iterate sparse tensors.
//		//
// An example usage:		// An example usage:
// To generate the following loops over T1<?x?> and T2<?x?>		// To generate the following loops over T1<?x?> and T2<?x?>
//		//
// for i in TENSOR_1_0 {		// for i in TENSOR_1_0 {
// for j : TENSOR_2_0 {		// for j : TENSOR_2_0 {
// for k : TENSOR_1_1 {}		// for k : TENSOR_1_1 {}
// for k : TENSOR_2_1 {}		// for k : TENSOR_2_1 {}
(16 lines not shown)

class SparseTensorLoopEmitter {		class SparseTensorLoopEmitter {
public:		public:
/// Optional callback function to setup dense output tensors when		/// Optional callback function to setup dense output tensors when
/// initializing the loop emitter (e.g., to fill a dense output with zeros).		/// initializing the loop emitter (e.g., to fill a dense output with zeros).
using OutputUpdater = function_ref<Value(OpBuilder &builder, Location loc,		using OutputUpdater = function_ref<Value(OpBuilder &builder, Location loc,
Value memref, Value tensor)>;		Value memref, Value tensor)>;

/// Constructor: take an array of tensors inputs, on which the generated loops		/// Constructor: take an array of tensors inputs, on which the generated
/// will iterate on. The index of the tensor in the array is also the		/// loops will iterate on. The index of the tensor in the array is also the
/// tensor id (tid) used in related functions.		/// tensor id (tid) used in related functions.
/// If isSparseOut is set, loop emitter assume that the sparse output tensor		/// If isSparseOut is set, loop emitter assume that the sparse output tensor
/// is empty, and will always generate loops on it based on the dim sizes.		/// is empty, and will always generate loops on it based on the dim sizes.
/// An optional array could be provided (by sparsification) to indicate the		/// An optional array could be provided (by sparsification) to indicate the
/// loop id sequence that will be generated. It is used to establish the		/// loop id sequence that will be generated. It is used to establish the
/// mapping between affineDimExpr to the corresponding loop index in the loop		/// mapping between affineDimExpr to the corresponding loop index in the
/// stack that are maintained by the loop emitter.		/// loop stack that are maintained by the loop emitter.
explicit SparseTensorLoopEmitter(ValueRange tensors,		explicit SparseTensorLoopEmitter(ValueRange tensors,
StringAttr loopTag = nullptr,		StringAttr loopTag = nullptr,
bool hasOutput = false,		bool hasOutput = false,
bool isSparseOut = false,		bool isSparseOut = false,
ArrayRef<unsigned> topSort = {});		ArrayRef<unsigned> topSort = {});

/// Starts a loop emitting session by generating all the buffers needed to		/// Starts a loop emitting session by generating all the buffers needed to
/// iterate tensors.		/// iterate tensors.
void initializeLoopEmit(OpBuilder &builder, Location loc,		void initializeLoopEmit(OpBuilder &builder, Location loc,
OutputUpdater updater = nullptr);		OutputUpdater updater = nullptr);

/// Generates a list of operations to compute the affine expression.		/// Generates a list of operations to compute the affine expression.
Value genAffine(OpBuilder &builder, AffineExpr a, Location loc);		Value genAffine(OpBuilder &builder, AffineExpr a, Location loc);

/// Enters a new loop sequence, the loops within the same sequence starts from		/// Enters a new loop sequence, the loops within the same sequence starts
/// the break points of previous loop instead of starting over from 0.		/// from the break points of previous loop instead of starting over from 0.
/// e.g.,		/// e.g.,
/// {		/// {
/// // loop sequence start.		/// // loop sequence start.
/// p0 = while(xxx)		/// p0 = while(xxx)
/// ...		/// ...
/// break p0		/// break p0
///		///
/// // Starts loop from p0		/// // Starts loop from p0
(138 lines not shown)	private:
/// will be transformed into		/// will be transformed into
/// %ret = parallel () init(%args) {		/// %ret = parallel () init(%args) {
/// ...		/// ...
/// scf.reduce(%c) bb0(%0, %1){		/// scf.reduce(%c) bb0(%0, %1){
/// %val = op %0, %1		/// %val = op %0, %1
/// scf.reduce.return %val		/// scf.reduce.return %val
/// }		/// }
/// }		/// }
/// NOTE: only one instruction will be moved into reduce block, transformation		/// NOTE: only one instruction will be moved into reduce block,
/// will fail if multiple instructions are used to compute the reduction		/// transformation will fail if multiple instructions are used to compute
/// value.		/// the reduction value. Return %ret to user, while %val is provided by
/// Return %ret to user, while %val is provided by users (`reduc`).		/// users (`reduc`).
void exitForLoop(RewriterBase &rewriter, Location loc,		void exitForLoop(RewriterBase &rewriter, Location loc,
MutableArrayRef<Value> reduc);		MutableArrayRef<Value> reduc);

/// Exits a while loop, returns the reduction results.		/// Exits a while loop, returns the reduction results.
void exitCoIterationLoop(OpBuilder &builder, Location loc,		void exitCoIterationLoop(OpBuilder &builder, Location loc,
MutableArrayRef<Value> reduc);		MutableArrayRef<Value> reduc);

/// A optional string attribute that should be attached to the loop generated		/// A optional string attribute that should be attached to the loop
/// by loop emitter, it might help following passes to identify loops that		/// generated by loop emitter, it might help following passes to identify
/// operates on sparse tensors more easily.		/// loops that operates on sparse tensors more easily.
StringAttr loopTag;		StringAttr loopTag;
/// Whether the loop emitter needs to treat the last tensor as the output		/// Whether the loop emitter needs to treat the last tensor as the output
/// tensor.		/// tensor.
bool hasOutput;		bool hasOutput;
bool isSparseOut;		bool isSparseOut;
/// Input and (optional) output tensors.		/// Input and (optional) output tensors.
std::vector<Value> tensors;		std::vector<Value> tensors;
/// The dim type array for each tensor.		/// The dim type array for each tensor.
std::vector<std::vector<DimLevelType>> dimTypes;		std::vector<std::vector<DimLevelType>> dimTypes;
/// Sparse iteration information (by tensor and dim). These arrays		/// Sparse iteration information (by tensor and dim). These arrays
/// are updated to remain current within the current loop.		/// are updated to remain current within the current loop.
std::vector<std::vector<Value>> pidxs;		std::vector<std::vector<Value>> pidxs;
std::vector<std::vector<Value>> coord;		std::vector<std::vector<Value>> coord;
std::vector<std::vector<Value>> highs;		std::vector<std::vector<Value>> highs;
std::vector<std::vector<Value>> ptrBuffer; // to_pointers		std::vector<std::vector<Value>> ptrBuffer; // to_pointers
std::vector<std::vector<Value>> idxBuffer; // to_indices		std::vector<std::vector<Value>> idxBuffer; // to_indices
std::vector<Value> valBuffer; // to_value		std::vector<Value> valBuffer; // to_value

// Loop Stack, stores the information of all the nested loops that are alive.		// Loop Stack, stores the information of all the nested loops that are
		// alive.
std::vector<LoopLevelInfo> loopStack;		std::vector<LoopLevelInfo> loopStack;

// Loop Sequence Stack, stores the universal index for the current loop		// Loop Sequence Stack, stores the universal index for the current loop
// sequence.		// sequence.
std::vector<Value> loopSeqStack;		std::vector<Value> loopSeqStack;

// Maps AffineDimExpr to the index of the loop in loopStack.		// Maps AffineDimExpr to the index of the loop in loopStack.
// TODO: We should probably use a callback function here to make it more		// TODO: We should probably use a callback function here to make it more
Show All 12 Lines

mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.cpp

Show First 20 Lines • Show All 84 Lines • ▼ Show 20 Lines	static Value genIndexAndValueForDense(OpBuilder &builder, Location loc,
Value tensor,		Value tensor,
SmallVectorImpl<Value> &indicesArray,		SmallVectorImpl<Value> &indicesArray,
ValueRange ivs) {		ValueRange ivs) {
Value val = genValueForDense(builder, loc, tensor, ivs);		Value val = genValueForDense(builder, loc, tensor, ivs);
indicesArray.append(ivs.begin(), ivs.end());		indicesArray.append(ivs.begin(), ivs.end());
return val;		return val;
}		}

		void sparse_tensor::foreachFieldInSparseTensor(
		const SparseTensorEncodingAttr enc,
		llvm::function_ref<bool(unsigned, SparseTensorFieldKind, unsigned,
		DimLevelType)>
		callback) {
		assert(enc);

		aartbik (Done): This description describes the layout and thus drives all the code. Can you please move this into the header, inside the class documentation?
		#define RETURN_ON_FALSE(idx, kind, dim, dlt) \
		if (!(callback(idx, kind, dim, dlt))) \
		return;

		RETURN_ON_FALSE(dimSizesIdx, SparseTensorFieldKind::DimSizes, -1u,
		DimLevelType::Undef);
		RETURN_ON_FALSE(memSizesIdx, SparseTensorFieldKind::MemSizes, -1u,
		DimLevelType::Undef);

		static_assert(dataFieldIdx == memSizesIdx + 1);
		unsigned fieldIdx = dataFieldIdx;
		// Per-dimension storage.
		for (unsigned r = 0, rank = enc.getDimLevelType().size(); r < rank; r++) {
		// Dimension level types apply in order to the reordered dimension.
		// As a result, the compound type can be constructed directly in the given
		// order.
		auto dlt = getDimLevelType(enc, r);
		if (isCompressedDLT(dlt)) {
		RETURN_ON_FALSE(fieldIdx++, SparseTensorFieldKind::PtrMemRef, r, dlt);
		RETURN_ON_FALSE(fieldIdx++, SparseTensorFieldKind::IdxMemRef, r, dlt);
		} else if (isSingletonDLT(dlt)) {
		RETURN_ON_FALSE(fieldIdx++, SparseTensorFieldKind::IdxMemRef, r, dlt);
		} else {
		assert(isDenseDLT(dlt)); // no fields
		}
		}
		// The values array.
		RETURN_ON_FALSE(fieldIdx++, SparseTensorFieldKind::ValMemRef, -1u,
		DimLevelType::Undef);

		#undef RETURN_ON_FALSE
		}

		void sparse_tensor::foreachFieldAndTypeInSparseTensor(
		RankedTensorType rType,
		llvm::function_ref<bool(Type, unsigned, SparseTensorFieldKind, unsigned,
		DimLevelType)>
		callback) {
		auto enc = getSparseTensorEncoding(rType);
		assert(enc);
		// Construct the basic types.
		auto *context = rType.getContext();
		unsigned idxWidth = enc.getIndexBitWidth();
		unsigned ptrWidth = enc.getPointerBitWidth();
		Type indexType = IndexType::get(context);
		Type idxType = idxWidth ? IntegerType::get(context, idxWidth) : indexType;
		Type ptrType = ptrWidth ? IntegerType::get(context, ptrWidth) : indexType;
		Type eltType = rType.getElementType();
		unsigned rank = rType.getShape().size();
		// memref<rank x index> dimSizes
		Type dimSizeType = MemRefType::get({rank}, indexType);
		// memref<n x index> memSizes
		Type memSizeType =
		MemRefType::get({getNumDataFieldsFromEncoding(enc)}, indexType);
		// memref<? x ptr> pointers
		Type ptrMemType = MemRefType::get({ShapedType::kDynamic}, ptrType);
		// memref<? x idx> indices
		Type idxMemType = MemRefType::get({ShapedType::kDynamic}, idxType);
		// memref<? x eltType> values
		Type valMemType = MemRefType::get({ShapedType::kDynamic}, eltType);

		foreachFieldInSparseTensor(
		enc,
		[dimSizeType, memSizeType, ptrMemType, idxMemType, valMemType,
		callback](unsigned fieldIdx, SparseTensorFieldKind fieldKind,
		unsigned dim, DimLevelType dlt) -> bool {
		switch (fieldKind) {
		case SparseTensorFieldKind::DimSizes:
		return callback(dimSizeType, fieldIdx, fieldKind, dim, dlt);
		case SparseTensorFieldKind::MemSizes:
		return callback(memSizeType, fieldIdx, fieldKind, dim, dlt);
		case SparseTensorFieldKind::PtrMemRef:
		return callback(ptrMemType, fieldIdx, fieldKind, dim, dlt);
		case SparseTensorFieldKind::IdxMemRef:
		return callback(idxMemType, fieldIdx, fieldKind, dim, dlt);
		case SparseTensorFieldKind::ValMemRef:
		return callback(valMemType, fieldIdx, fieldKind, dim, dlt);
		};
		});
		}

		unsigned sparse_tensor::getNumFieldsFromEncoding(SparseTensorEncodingAttr enc) {
		unsigned numFields = 0;
		foreachFieldInSparseTensor(enc,
		[&numFields](unsigned, SparseTensorFieldKind,
		unsigned, DimLevelType) -> bool {
		numFields++;
		return true;
		});
		return numFields;
		}

		unsigned
		sparse_tensor::getNumDataFieldsFromEncoding(SparseTensorEncodingAttr enc) {
		unsigned numFields = 0; // one value memref
		foreachFieldInSparseTensor(enc,
		[&numFields](unsigned fidx, SparseTensorFieldKind,
		unsigned, DimLevelType) -> bool {
		if (fidx >= dataFieldIdx)
		numFields++;
		return true;
		});
		assert(numFields == getNumFieldsFromEncoding(enc) - dataFieldIdx);
		return numFields;
		}
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Sparse tensor loop emitter class implementations		// Sparse tensor loop emitter class implementations
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

SparseTensorLoopEmitter::SparseTensorLoopEmitter(ValueRange tensors,		SparseTensorLoopEmitter::SparseTensorLoopEmitter(ValueRange tensors,
StringAttr loopTag,		StringAttr loopTag,
bool hasOutput,		bool hasOutput,
bool isSparseOut,		bool isSparseOut,
▲ Show 20 Lines • Show All 1,069 Lines • Show Last 20 Lines
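
The field enumeration introduced in CodegenUtils.cpp above can be tried outside of MLIR. The following is a minimal standalone sketch, assuming simplified stand-ins for the real types: `LevelType`, `FieldKind`, `foreachField`, `countFields`, and `countDataFields` are hypothetical names chosen for illustration and are not part of the patch. It mirrors the layout order used by `foreachFieldInSparseTensor` (dimSizes, memSizes, per-dimension pointers/indices, then values) and the early-exit convention where the callback returns false to stop.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Simplified stand-ins for DimLevelType and SparseTensorFieldKind
// (hypothetical; the real definitions live in the MLIR sources).
enum class LevelType { Dense, Compressed, Singleton, Undef };
enum class FieldKind { DimSizes, MemSizes, PtrMemRef, IdxMemRef, ValMemRef };

constexpr unsigned kDimSizesIdx = 0;
constexpr unsigned kMemSizesIdx = 1;
constexpr unsigned kDataFieldIdx = kMemSizesIdx + 1;
constexpr unsigned kNoDim = -1u; // marker for fields not tied to a dimension

using FieldCallback =
    std::function<bool(unsigned, FieldKind, unsigned, LevelType)>;

// Visits every storage field in layout order. Iteration stops as soon as
// the callback returns false.
void foreachField(const std::vector<LevelType> &levels,
                  const FieldCallback &callback) {
  if (!callback(kDimSizesIdx, FieldKind::DimSizes, kNoDim, LevelType::Undef))
    return;
  if (!callback(kMemSizesIdx, FieldKind::MemSizes, kNoDim, LevelType::Undef))
    return;
  unsigned fieldIdx = kDataFieldIdx;
  for (unsigned r = 0; r < levels.size(); r++) {
    LevelType lt = levels[r];
    if (lt == LevelType::Compressed) {
      // A compressed dimension owns a pointers buffer and an indices buffer.
      if (!callback(fieldIdx++, FieldKind::PtrMemRef, r, lt))
        return;
      if (!callback(fieldIdx++, FieldKind::IdxMemRef, r, lt))
        return;
    } else if (lt == LevelType::Singleton) {
      // A singleton dimension owns only an indices buffer.
      if (!callback(fieldIdx++, FieldKind::IdxMemRef, r, lt))
        return;
    } else {
      assert(lt == LevelType::Dense); // dense dimensions need no fields
    }
  }
  // The values buffer always comes last.
  callback(fieldIdx, FieldKind::ValMemRef, kNoDim, LevelType::Undef);
}

// Counts all fields, including the dimSizes/memSizes header.
unsigned countFields(const std::vector<LevelType> &levels) {
  unsigned n = 0;
  foreachField(levels, [&n](unsigned, FieldKind, unsigned, LevelType) {
    n++;
    return true;
  });
  return n;
}

// Counts only the data fields, i.e. those at or past kDataFieldIdx.
unsigned countDataFields(const std::vector<LevelType> &levels) {
  unsigned n = 0;
  foreachField(levels, [&n](unsigned fidx, FieldKind, unsigned, LevelType) {
    if (fidx >= kDataFieldIdx)
      n++;
    return true;
  });
  return n;
}
```

Under these assumptions, a CSR-like encoding (dense, compressed) yields five fields in total: two header fields, one pointers buffer, one indices buffer, and the values buffer.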

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorCodegen.cpp

Show All 24 Lines
#include "mlir/Dialect/SparseTensor/Transforms/Passes.h"		#include "mlir/Dialect/SparseTensor/Transforms/Passes.h"
#include "mlir/Dialect/Tensor/IR/Tensor.h"		#include "mlir/Dialect/Tensor/IR/Tensor.h"
#include "mlir/Transforms/DialectConversion.h"		#include "mlir/Transforms/DialectConversion.h"

using namespace mlir;		using namespace mlir;
using namespace mlir::sparse_tensor;		using namespace mlir::sparse_tensor;

namespace {		namespace {

static constexpr uint64_t dimSizesIdx = 0;
static constexpr uint64_t memSizesIdx = 1;
static constexpr uint64_t fieldsIdx = 2;

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Helper methods.		// Helper methods.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// Returns the "tuple" value of the adapted tensor.		/// Returns the "tuple" value of the adapted tensor.
static UnrealizedConversionCastOp getTuple(Value tensor) {		static UnrealizedConversionCastOp getTuple(Value tensor) {
return llvm::cast<UnrealizedConversionCastOp>(tensor.getDefiningOp());		return llvm::cast<UnrealizedConversionCastOp>(tensor.getDefiningOp());
}		}

		static SparseTensorDescriptor getDescriptorFromTensorTuple(Value tensor) {
		auto tuple = getTuple(tensor);
		return SparseTensorDescriptor(tuple.getResultTypes()[0], tuple.getInputs());
		}

		static MutSparseTensorDescriptor
		getMutDescriptorFromTensorTuple(Value tensor, SmallVectorImpl<Value> &fields) {
		auto tuple = getTuple(tensor);
		fields.assign(tuple.getInputs().begin(), tuple.getInputs().end());
		return MutSparseTensorDescriptor(tuple.getResultTypes()[0], fields);
		}

/// Packs the given values as a "tuple" value.		/// Packs the given values as a "tuple" value.
static Value genTuple(OpBuilder &builder, Location loc, Type tp,		static Value genTuple(OpBuilder &builder, Location loc, Type tp,
ValueRange values) {		ValueRange values) {
return builder.create<UnrealizedConversionCastOp>(loc, TypeRange(tp), values)		return builder.create<UnrealizedConversionCastOp>(loc, TypeRange(tp), values)
.getResult(0);		.getResult(0);
}		}

		static Value genTuple(OpBuilder &builder, Location loc,
		SparseTensorDescriptor desc) {
		return builder
		.create<UnrealizedConversionCastOp>(loc, desc.getTensorType(),
		desc.getFields())
		.getResult(0);
		}

/// Flatten a list of operands that may contain sparse tensors.		/// Flatten a list of operands that may contain sparse tensors.
static void flattenOperands(ValueRange operands,		static void flattenOperands(ValueRange operands,
SmallVectorImpl<Value> &flattened) {		SmallVectorImpl<Value> &flattened) {
// In case of		// In case of
// sparse_tensor, c, sparse_tensor		// sparse_tensor, c, sparse_tensor
// ==>		// ==>
// memref ..., c, memref ...		// memref ..., c, memref ...
for (auto operand : operands) {		for (auto operand : operands) {
Show All 29 Lines	static void genStore(OpBuilder &builder, Location loc, Value val, Value mem,
idx = toType(builder, loc, idx, builder.getIndexType());		idx = toType(builder, loc, idx, builder.getIndexType());
val = toType(builder, loc, val,		val = toType(builder, loc, val,
mem.getType().cast<ShapedType>().getElementType());		mem.getType().cast<ShapedType>().getElementType());
builder.create<memref::StoreOp>(loc, val, mem, idx);		builder.create<memref::StoreOp>(loc, val, mem, idx);
}		}

/// Creates a straightforward counting for-loop.		/// Creates a straightforward counting for-loop.
static scf::ForOp createFor(OpBuilder &builder, Location loc, Value upper,		static scf::ForOp createFor(OpBuilder &builder, Location loc, Value upper,
SmallVectorImpl<Value> &fields,		MutableArrayRef<Value> fields,
Value lower = Value()) {		Value lower = Value()) {
Type indexType = builder.getIndexType();		Type indexType = builder.getIndexType();
if (!lower)		if (!lower)
lower = constantZero(builder, loc, indexType);		lower = constantZero(builder, loc, indexType);
Value one = constantOne(builder, loc, indexType);		Value one = constantOne(builder, loc, indexType);
scf::ForOp forOp = builder.create<scf::ForOp>(loc, lower, upper, one, fields);		scf::ForOp forOp = builder.create<scf::ForOp>(loc, lower, upper, one, fields);
for (unsigned i = 0, e = fields.size(); i < e; i++)		for (unsigned i = 0, e = fields.size(); i < e; i++)
fields[i] = forOp.getRegionIterArg(i);		fields[i] = forOp.getRegionIterArg(i);
builder.setInsertionPointToStart(forOp.getBody());		builder.setInsertionPointToStart(forOp.getBody());
return forOp;		return forOp;
}		}

/// Gets the dimension size for the given sparse tensor at the given		/// Gets the dimension size for the given sparse tensor at the given
/// original dimension 'dim'. Returns None if no sparse encoding is		/// original dimension 'dim'. Returns None if no sparse encoding is
/// attached to the given tensor type.		/// attached to the given tensor type.
static Optional<Value> sizeFromTensorAtDim(OpBuilder &builder, Location loc,		static Optional<Value> sizeFromTensorAtDim(OpBuilder &builder, Location loc,
RankedTensorType tensorTp,		SparseTensorDescriptor desc,
Value adaptedValue, unsigned dim) {		unsigned dim) {
auto enc = getSparseTensorEncoding(tensorTp);		RankedTensorType rtp = desc.getTensorType();
if (!enc)
return llvm::None;

// Access into static dimension can query original type directly.		// Access into static dimension can query original type directly.
// Note that this is typically already done by DimOp's folding.		// Note that this is typically already done by DimOp's folding.
auto shape = tensorTp.getShape();		auto shape = rtp.getShape();
if (!ShapedType::isDynamic(shape[dim]))		if (!ShapedType::isDynamic(shape[dim]))
return constantIndex(builder, loc, shape[dim]);		return constantIndex(builder, loc, shape[dim]);

// Any other query can consult the dimSizes array at field DimSizesIdx,		// Any other query can consult the dimSizes array at field DimSizesIdx,
// accounting for the reordering applied to the sparse storage.		// accounting for the reordering applied to the sparse storage.
auto tuple = getTuple(adaptedValue);		Value idx = constantIndex(builder, loc, toStoredDim(rtp, dim));
Value idx = constantIndex(builder, loc, toStoredDim(tensorTp, dim));		return builder.create<memref::LoadOp>(loc, desc.getDimSizesMemRef(), idx)
return builder
.create<memref::LoadOp>(loc, tuple.getInputs()[dimSizesIdx], idx)
.getResult();		.getResult();
}		}

// Gets the dimension size at the given stored dimension 'd', either as a		// Gets the dimension size at the given stored dimension 'd', either as a
// constant for a static size, or otherwise dynamically through memSizes.		// constant for a static size, or otherwise dynamically through memSizes.
Value sizeAtStoredDim(OpBuilder &builder, Location loc, RankedTensorType rtp,		Value sizeAtStoredDim(OpBuilder &builder, Location loc,
SmallVectorImpl<Value> &fields, unsigned d) {		SparseTensorDescriptor desc, unsigned d) {
		RankedTensorType rtp = desc.getTensorType();
unsigned dim = toOrigDim(rtp, d);		unsigned dim = toOrigDim(rtp, d);
auto shape = rtp.getShape();		auto shape = rtp.getShape();
if (!ShapedType::isDynamic(shape[dim]))		if (!ShapedType::isDynamic(shape[dim]))
return constantIndex(builder, loc, shape[dim]);		return constantIndex(builder, loc, shape[dim]);
return genLoad(builder, loc, fields[dimSizesIdx],
constantIndex(builder, loc, d));
}

/// Translates field index to memSizes index.		return genLoad(builder, loc, desc.getDimSizesMemRef(),
static unsigned getMemSizesIndex(unsigned field) {		constantIndex(builder, loc, d));
assert(fieldsIdx <= field);
return field - fieldsIdx;
}		}

/// Creates a pushback op for given field and updates the fields array
/// accordingly. This operation also updates the memSizes contents.
static void createPushback(OpBuilder &builder, Location loc,		static void createPushback(OpBuilder &builder, Location loc,
SmallVectorImpl<Value> &fields, unsigned field,		MutSparseTensorDescriptor desc, unsigned fidx,
Value value, Value repeat = Value()) {		Value value, Value repeat = Value()) {
assert(fieldsIdx <= field && field < fields.size());		Type etp = desc.getElementType(fidx);
Type etp = fields[field].getType().cast<ShapedType>().getElementType();		Value field = desc.getField(fidx);
fields[field] = builder.create<PushBackOp>(		Value newField = builder.create<PushBackOp>(
loc, fields[field].getType(), fields[memSizesIdx], fields[field],		loc, field.getType(), desc.getMemSizesMemRef(), field,
		Peiming (author, Done): A better way to create PushBackOp would be `PushBackOp(builder, loc, descriptor, fidx, value)`, which is clearer and can be read as "push a value into the sparse tensor descriptor at the given field index". But it would require us to put `SparseTensorDescriptor` in a publicly available place (and I am not sure whether that is wanted).
toType(builder, loc, value, etp), APInt(64, getMemSizesIndex(field)),		toType(builder, loc, value, etp), APInt(64, getFieldMemSizesIndex(fidx)),
repeat);		repeat);
}		desc.setField(fidx, newField);

/// Returns field index of sparse tensor type for pointers/indices, when set.
static unsigned getFieldIndex(Type type, unsigned ptrDim, unsigned idxDim) {
assert(getSparseTensorEncoding(type));
RankedTensorType rType = type.cast<RankedTensorType>();
unsigned field = fieldsIdx; // start past header
for (unsigned r = 0, rank = rType.getShape().size(); r < rank; r++) {
if (isCompressedDim(rType, r)) {
if (r == ptrDim)
return field;
field++;
if (r == idxDim)
return field;
field++;
} else if (isSingletonDim(rType, r)) {
if (r == idxDim)
return field;
field++;
} else {
assert(isDenseDim(rType, r)); // no fields
}
}
assert(ptrDim == -1u && idxDim == -1u);
return field + 1; // return values field index
}		}
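
The "vector" behavior that createPushback relies on (the memref size is the capacity, the used size lives in memSizes) can be modeled in plain C++. This is only a sketch under stated assumptions: `ToyStorage`, `memSizesIndex`, and the capacity-doubling growth policy are hypothetical and chosen for illustration; the mapping `fidx - kDataFieldIdx` follows the removed `getMemSizesIndex` helper and is assumed to carry over to `getFieldMemSizesIndex`. The initial capacity of 16 echoes the heuristic constant used in createAllocFields.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed layout: dimSizes at field 0, memSizes at field 1, data from 2 on.
constexpr unsigned kDataFieldIdx = 2;

// Assumed mapping from a field index to its slot in the memSizes array.
inline unsigned memSizesIndex(unsigned fidx) {
  assert(fidx >= kDataFieldIdx);
  return fidx - kDataFieldIdx;
}

// Toy model of the data fields of a sparse tensor storage scheme.
struct ToyStorage {
  std::vector<std::size_t> memSizes;            // used size per data field
  std::vector<std::vector<int64_t>> dataFields; // size() acts as capacity

  explicit ToyStorage(unsigned numDataFields, std::size_t capacity = 16)
      : memSizes(numDataFields, 0),
        dataFields(numDataFields, std::vector<int64_t>(capacity, 0)) {}

  // Appends `repeat` copies of `value` to field `fidx` and updates the
  // used size in memSizes. Capacity doubles until the new size fits
  // (a hypothetical growth policy).
  void pushBack(unsigned fidx, int64_t value, std::size_t repeat = 1) {
    unsigned slot = memSizesIndex(fidx);
    std::vector<int64_t> &buf = dataFields[slot];
    std::size_t used = memSizes[slot];
    std::size_t needed = used + repeat;
    std::size_t cap = buf.empty() ? 1 : buf.size();
    while (cap < needed)
      cap *= 2;
    buf.resize(cap, 0);
    for (std::size_t i = 0; i < repeat; i++)
      buf[used + i] = value;
    memSizes[slot] = needed;
  }

  std::size_t usedSize(unsigned fidx) const {
    return memSizes[memSizesIndex(fidx)];
  }
  std::size_t capacity(unsigned fidx) const {
    return dataFields[memSizesIndex(fidx)].size();
  }
};
```

In this model, pushing a single leading zero into a pointers field leaves the capacity untouched and only bumps the used size, which is the same bookkeeping the patch routes through the descriptor and its memSizes memref.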

/// Maps a sparse tensor type to the appropriate compounded buffers.		/// Maps a sparse tensor type to the appropriate compounded buffers.
static Optional<LogicalResult>		static Optional<LogicalResult>
convertSparseTensorType(Type type, SmallVectorImpl<Type> &fields) {		convertSparseTensorType(Type type, SmallVectorImpl<Type> &fields) {
auto enc = getSparseTensorEncoding(type);		auto enc = getSparseTensorEncoding(type);
if (!enc)		if (!enc)
return llvm::None;		return llvm::None;
// Construct the basic types.
auto *context = type.getContext();
unsigned idxWidth = enc.getIndexBitWidth();
unsigned ptrWidth = enc.getPointerBitWidth();
RankedTensorType rType = type.cast<RankedTensorType>();		RankedTensorType rType = type.cast<RankedTensorType>();
Type indexType = IndexType::get(context);		foreachFieldAndTypeInSparseTensor(
Type idxType = idxWidth ? IntegerType::get(context, idxWidth) : indexType;		rType,
Type ptrType = ptrWidth ? IntegerType::get(context, ptrWidth) : indexType;		[&fields](Type fieldType, unsigned fieldIdx,
Type eltType = rType.getElementType();		SparseTensorFieldKind /*fieldKind*/, unsigned /*dim*/,
//		DimLevelType /*dlt*/) -> bool {
// Sparse tensor storage scheme for rank-dimensional tensor is organized		assert(fieldIdx == fields.size());
// as a single compound type with the following fields. Note that every		fields.push_back(fieldType);
// memref with ? size actually behaves as a "vector", i.e. the stored		return true;
// size is the capacity and the used size resides in the memSizes array.		});
//
// struct {
// memref<rank x index> dimSizes ; size in each dimension
// memref<n x index> memSizes ; sizes of ptrs/inds/values
// ; per-dimension d:
// ; if dense:
// <nothing>
// ; if compresed:
// memref<? x ptr> pointers-d ; pointers for sparse dim d
// memref<? x idx> indices-d ; indices for sparse dim d
// ; if singleton:
// memref<? x idx> indices-d ; indices for singleton dim d
// memref<? x eltType> values ; values
// };
//
unsigned rank = rType.getShape().size();
unsigned lastField = getFieldIndex(type, -1u, -1u);
// The dimSizes array and memSizes array.
fields.push_back(MemRefType::get({rank}, indexType));
fields.push_back(MemRefType::get({getMemSizesIndex(lastField)}, indexType));
// Per-dimension storage.
for (unsigned r = 0; r < rank; r++) {
// Dimension level types apply in order to the reordered dimension.
// As a result, the compound type can be constructed directly in the given
// order. Clients of this type know what field is what from the sparse
// tensor type.
if (isCompressedDim(rType, r)) {
fields.push_back(MemRefType::get({ShapedType::kDynamic}, ptrType));
fields.push_back(MemRefType::get({ShapedType::kDynamic}, idxType));
} else if (isSingletonDim(rType, r)) {
fields.push_back(MemRefType::get({ShapedType::kDynamic}, idxType));
} else {
assert(isDenseDim(rType, r)); // no fields
}
}
// The values array.
fields.push_back(MemRefType::get({ShapedType::kDynamic}, eltType));
assert(fields.size() == lastField);
return success();		return success();
}		}

/// Generates code that allocates a sparse storage scheme for given rank.		/// Generates code that allocates a sparse storage scheme for given rank.
static void allocSchemeForRank(OpBuilder &builder, Location loc,		static void allocSchemeForRank(OpBuilder &builder, Location loc,
RankedTensorType rtp,		MutSparseTensorDescriptor desc, unsigned r0) {
SmallVectorImpl<Value> &fields, unsigned field,		RankedTensorType rtp = desc.getTensorType();
unsigned r0) {
unsigned rank = rtp.getShape().size();		unsigned rank = rtp.getShape().size();
Value linear = constantIndex(builder, loc, 1);		Value linear = constantIndex(builder, loc, 1);
for (unsigned r = r0; r < rank; r++) {		for (unsigned r = r0; r < rank; r++) {
if (isCompressedDim(rtp, r)) {		if (isCompressedDim(rtp, r)) {
// Append linear x pointers, initialized to zero. Since each compressed		// Append linear x pointers, initialized to zero. Since each compressed
// dimension initially already has a single zero entry, this maintains		// dimension initially already has a single zero entry, this maintains
// the desired "linear + 1" length property at all times.		// the desired "linear + 1" length property at all times.
unsigned ptrWidth = getSparseTensorEncoding(rtp).getPointerBitWidth();		Type ptrType = desc.getPtrElementType();
Type indexType = builder.getIndexType();
Type ptrType = ptrWidth ? builder.getIntegerType(ptrWidth) : indexType;
Value ptrZero = constantZero(builder, loc, ptrType);		Value ptrZero = constantZero(builder, loc, ptrType);
createPushback(builder, loc, fields, field, ptrZero, linear);		unsigned fidx = desc.getPtrMemRefIndex(r);
		createPushback(builder, loc, desc, fidx, ptrZero, linear);
return;		return;
}		}
if (isSingletonDim(rtp, r)) {		if (isSingletonDim(rtp, r)) {
return; // nothing to do		return; // nothing to do
} // Keep compounding the size, but nothing needs to be initialized		} // Keep compounding the size, but nothing needs to be initialized
// at this level. We will eventually reach a compressed level or		// at this level. We will eventually reach a compressed level or
// otherwise the values array for the from-here "all-dense" case.		// otherwise the values array for the from-here "all-dense" case.
assert(isDenseDim(rtp, r));		assert(isDenseDim(rtp, r));
Value size = sizeAtStoredDim(builder, loc, rtp, fields, r);		Value size = sizeAtStoredDim(builder, loc, desc, r);
linear = builder.create<arith::MulIOp>(loc, linear, size);		linear = builder.create<arith::MulIOp>(loc, linear, size);
}		}
// Reached values array so prepare for an insertion.		// Reached values array so prepare for an insertion.
Value valZero = constantZero(builder, loc, rtp.getElementType());		Value valZero = constantZero(builder, loc, rtp.getElementType());
createPushback(builder, loc, fields, field, valZero, linear);		createPushback(builder, loc, desc, desc.getValMemRefIndex(), valZero, linear);
assert(fields.size() == ++field);
}		}

/// Creates allocation operation.		/// Creates allocation operation.
static Value createAllocation(OpBuilder &builder, Location loc, Type type,		static Value createAllocation(OpBuilder &builder, Location loc,
Value sz, bool enableInit) {		MemRefType memRefType, Value sz,
auto memType = MemRefType::get({ShapedType::kDynamic}, type);		bool enableInit) {
Value buffer = builder.create<memref::AllocOp>(loc, memType, sz);		Value buffer = builder.create<memref::AllocOp>(loc, memRefType, sz);
		Type elemType = memRefType.getElementType();
if (enableInit) {		if (enableInit) {
Value fillValue =		Value fillValue = builder.create<arith::ConstantOp>(
builder.create<arith::ConstantOp>(loc, type, builder.getZeroAttr(type));		loc, elemType, builder.getZeroAttr(elemType));
builder.create<linalg::FillOp>(loc, fillValue, buffer);		builder.create<linalg::FillOp>(loc, fillValue, buffer);
}		}
return buffer;		return buffer;
}		}

/// Creates allocation for each field in sparse tensor type. Note that		/// Creates allocation for each field in sparse tensor type. Note that
/// for all dynamic memrefs, the memory size is really the capacity of		/// for all dynamic memrefs, the memory size is really the capacity of
/// the "vector", while the actual size resides in the sizes array.		/// the "vector", while the actual size resides in the sizes array.
///		///
/// TODO: for efficiency, we will need heuristics to make educated guesses		/// TODO: for efficiency, we will need heuristics to make educated guesses
/// on the required capacities (see heuristic variable).		/// on the required capacities (see heuristic variable).
///		///
static void createAllocFields(OpBuilder &builder, Location loc, Type type,		static void createAllocFields(OpBuilder &builder, Location loc, Type type,
ValueRange dynSizes, bool enableInit,		ValueRange dynSizes, bool enableInit,
SmallVectorImpl<Value> &fields) {		SmallVectorImpl<Value> &fields) {
auto enc = getSparseTensorEncoding(type);
assert(enc);
// Construct the basic types.
unsigned idxWidth = enc.getIndexBitWidth();
unsigned ptrWidth = enc.getPointerBitWidth();
RankedTensorType rtp = type.cast<RankedTensorType>();		RankedTensorType rtp = type.cast<RankedTensorType>();
Type indexType = builder.getIndexType();
Type idxType = idxWidth ? builder.getIntegerType(idxWidth) : indexType;
Type ptrType = ptrWidth ? builder.getIntegerType(ptrWidth) : indexType;
Type eltType = rtp.getElementType();
auto shape = rtp.getShape();
unsigned rank = shape.size();
Value heuristic = constantIndex(builder, loc, 16);		Value heuristic = constantIndex(builder, loc, 16);

		foreachFieldAndTypeInSparseTensor(
		aartbik (Not Done): Note that this works for now, but we actually planned to implement a heuristic here, which will need to scan the ranks + level types again.
		Peiming (author, Done): Then, you can have another foreach before this to compute the heuristic.
		rtp,
		[&](Type fType, unsigned fIdx, SparseTensorFieldKind fKind,
		unsigned /*dim*/, DimLevelType /*dlt*/) -> bool {
		assert(fields.size() == fIdx);
		auto memRefTp = fType.cast<MemRefType>();
		Value field;
		switch (fKind) {
		case SparseTensorFieldKind::DimSizes:
		case SparseTensorFieldKind::MemSizes:
		field = builder.create<memref::AllocOp>(loc, memRefTp);
		break;
		case SparseTensorFieldKind::PtrMemRef:
		case SparseTensorFieldKind::IdxMemRef:
		case SparseTensorFieldKind::ValMemRef:
		field =
		createAllocation(builder, loc, memRefTp, heuristic, enableInit);
		break;
		}
		assert(field);
		fields.push_back(field);
		// Returns true to ontinue the iteration.
		return true;
		aartbik (Done): continue
		});

		MutSparseTensorDescriptor desc(rtp, fields);

// Build original sizes.		// Build original sizes.
SmallVector<Value> sizes;		SmallVector<Value> sizes;
		auto shape = rtp.getShape();
		unsigned rank = shape.size();
for (unsigned r = 0, o = 0; r < rank; r++) {		for (unsigned r = 0, o = 0; r < rank; r++) {
if (ShapedType::isDynamic(shape[r]))		if (ShapedType::isDynamic(shape[r]))
sizes.push_back(dynSizes[o++]);		sizes.push_back(dynSizes[o++]);
else		else
sizes.push_back(constantIndex(builder, loc, shape[r]));		sizes.push_back(constantIndex(builder, loc, shape[r]));
}		}
// The dimSizes array and memSizes array.
unsigned lastField = getFieldIndex(type, -1u, -1u);
Value dimSizes =
builder.create<memref::AllocOp>(loc, MemRefType::get({rank}, indexType));
Value memSizes = builder.create<memref::AllocOp>(
loc, MemRefType::get({getMemSizesIndex(lastField)}, indexType));
fields.push_back(dimSizes);
fields.push_back(memSizes);
// Per-dimension storage.
for (unsigned r = 0; r < rank; r++) {
if (isCompressedDim(rtp, r)) {
fields.push_back(
createAllocation(builder, loc, ptrType, heuristic, enableInit));
fields.push_back(
createAllocation(builder, loc, idxType, heuristic, enableInit));
} else if (isSingletonDim(rtp, r)) {
fields.push_back(
createAllocation(builder, loc, idxType, heuristic, enableInit));
} else {
assert(isDenseDim(rtp, r)); // no fields
}
}
// The values array.
fields.push_back(
createAllocation(builder, loc, eltType, heuristic, enableInit));
assert(fields.size() == lastField);
// Initialize the storage scheme to an empty tensor. Initialized memSizes		// Initialize the storage scheme to an empty tensor. Initialized memSizes
// to all zeros, sets the dimSizes to known values and gives all pointer		// to all zeros, sets the dimSizes to known values and gives all pointer
// fields an initial zero entry, so that it is easier to maintain the		// fields an initial zero entry, so that it is easier to maintain the
// "linear + 1" length property.		// "linear + 1" length property.
builder.create<linalg::FillOp>(		builder.create<linalg::FillOp>(
loc, ValueRange{constantZero(builder, loc, indexType)},		loc, constantZero(builder, loc, builder.getIndexType()),
ValueRange{memSizes}); // zero memSizes		desc.getMemSizesMemRef()); // zero memSizes
Value ptrZero = constantZero(builder, loc, ptrType);
for (unsigned r = 0, field = fieldsIdx; r < rank; r++) {		Value ptrZero = constantZero(builder, loc, desc.getPtrElementType());
		for (unsigned r = 0; r < rank; r++) {
unsigned ro = toOrigDim(rtp, r);		unsigned ro = toOrigDim(rtp, r);
genStore(builder, loc, sizes[ro], dimSizes, constantIndex(builder, loc, r));		// Fills dim sizes array.
		genStore(builder, loc, sizes[ro], desc.getDimSizesMemRef(),
		constantIndex(builder, loc, r));

		// Pushes a leading zero to pointers memref.
if (isCompressedDim(rtp, r)) {		if (isCompressedDim(rtp, r)) {
createPushback(builder, loc, fields, field, ptrZero);		unsigned fidx =
field += 2;		desc.getDataFieldIndex(r, SparseTensorFieldKind::PtrMemRef);
} else if (isSingletonDim(rtp, r)) {		createPushback(builder, loc, desc, fidx, ptrZero);
field += 1;
}		}
}		}
allocSchemeForRank(builder, loc, rtp, fields, fieldsIdx, /*rank=*/0);		allocSchemeForRank(builder, loc, desc, /*rank=*/0);
}		}

/// Helper method that generates block specific to compressed case:		/// Helper method that generates block specific to compressed case:
///		///
/// plo = pointers[d][pos[d-1]]		/// plo = pointers[d][pos[d-1]]
/// phi = pointers[d][pos[d-1]+1]		/// phi = pointers[d][pos[d-1]+1]
/// msz = indices[d].size()		/// msz = indices[d].size()
/// if (plo < phi) {		/// if (plo < phi) {
/// present = indices[d][phi-1] == i[d]		/// present = indices[d][phi-1] == i[d]
/// } else { // first insertion		/// } else { // first insertion
/// present = false		/// present = false
/// pointers[d][pos[d-1]] = msz		/// pointers[d][pos[d-1]] = msz
/// }		/// }
/// if (present) { // index already present		/// if (present) { // index already present
/// next = phi-1		/// next = phi-1
/// } else {		/// } else {
/// indices[d].push_back(i[d])		/// indices[d].push_back(i[d])
/// pointers[d][pos[d-1]+1] = msz+1		/// pointers[d][pos[d-1]+1] = msz+1
/// next = msz		/// next = msz
/// <prepare dimension d + 1>		/// <prepare dimension d + 1>
/// }		/// }
/// pos[d] = next		/// pos[d] = next
static Value genCompressed(OpBuilder &builder, Location loc,		static Value genCompressed(OpBuilder &builder, Location loc,
RankedTensorType rtp, SmallVectorImpl<Value> &fields,		MutSparseTensorDescriptor desc,
SmallVectorImpl<Value> &indices, Value value,		SmallVectorImpl<Value> &indices, Value value,
Value pos, unsigned field, unsigned d) {		Value pos, unsigned d) {
		RankedTensorType rtp = desc.getTensorType();
unsigned rank = rtp.getShape().size();		unsigned rank = rtp.getShape().size();
SmallVector<Type> types;		SmallVector<Type> types;
Type indexType = builder.getIndexType();		Type indexType = builder.getIndexType();
Type boolType = builder.getIntegerType(1);		Type boolType = builder.getIntegerType(1);
		unsigned idxIndex = desc.getIdxMemRefIndex(d);
		unsigned ptrIndex = desc.getPtrMemRefIndex(d);
Value one = constantIndex(builder, loc, 1);		Value one = constantIndex(builder, loc, 1);
Value pp1 = builder.create<arith::AddIOp>(loc, pos, one);		Value pp1 = builder.create<arith::AddIOp>(loc, pos, one);
Value plo = genLoad(builder, loc, fields[field], pos);		Value plo = genLoad(builder, loc, desc.getField(ptrIndex), pos);
Value phi = genLoad(builder, loc, fields[field], pp1);		Value phi = genLoad(builder, loc, desc.getField(ptrIndex), pp1);
Value psz = constantIndex(builder, loc, getMemSizesIndex(field + 1));		Value psz = constantIndex(builder, loc, getFieldMemSizesIndex(idxIndex));
Value msz = genLoad(builder, loc, fields[memSizesIdx], psz);		Value msz = genLoad(builder, loc, desc.getMemSizesMemRef(), psz);
Value phim1 = builder.create<arith::SubIOp>(		Value phim1 = builder.create<arith::SubIOp>(
loc, toType(builder, loc, phi, indexType), one);		loc, toType(builder, loc, phi, indexType), one);
// Conditional expression.		// Conditional expression.
Value lt =		Value lt =
builder.create<arith::CmpIOp>(loc, arith::CmpIPredicate::ult, plo, phi);		builder.create<arith::CmpIOp>(loc, arith::CmpIPredicate::ult, plo, phi);
types.push_back(boolType);		types.push_back(boolType);
scf::IfOp ifOp1 = builder.create<scf::IfOp>(loc, types, lt, /*else*/ true);		scf::IfOp ifOp1 = builder.create<scf::IfOp>(loc, types, lt, /*else*/ true);
types.pop_back();		types.pop_back();
builder.setInsertionPointToStart(&ifOp1.getThenRegion().front());		builder.setInsertionPointToStart(&ifOp1.getThenRegion().front());
Value crd = genLoad(builder, loc, fields[field + 1], phim1);		Value crd = genLoad(builder, loc, desc.getField(idxIndex), phim1);
Value eq = builder.create<arith::CmpIOp>(loc, arith::CmpIPredicate::eq,		Value eq = builder.create<arith::CmpIOp>(loc, arith::CmpIPredicate::eq,
toType(builder, loc, crd, indexType),		toType(builder, loc, crd, indexType),
indices[d]);		indices[d]);
builder.create<scf::YieldOp>(loc, eq);		builder.create<scf::YieldOp>(loc, eq);
builder.setInsertionPointToStart(&ifOp1.getElseRegion().front());		builder.setInsertionPointToStart(&ifOp1.getElseRegion().front());
if (d > 0)		if (d > 0)
genStore(builder, loc, msz, fields[field], pos);		genStore(builder, loc, msz, desc.getField(ptrIndex), pos);
builder.create<scf::YieldOp>(loc, constantI1(builder, loc, false));		builder.create<scf::YieldOp>(loc, constantI1(builder, loc, false));
builder.setInsertionPointAfter(ifOp1);		builder.setInsertionPointAfter(ifOp1);
Value p = ifOp1.getResult(0);		Value p = ifOp1.getResult(0);
// If present construct. Note that for a non-unique dimension level, we simply		// If present construct. Note that for a non-unique dimension level, we
// set the condition to false and rely on CSE/DCE to clean up the IR.		// simply set the condition to false and rely on CSE/DCE to clean up the IR.
//		//
// TODO: generate less temporary IR?		// TODO: generate less temporary IR?
//		//
for (unsigned i = 0, e = fields.size(); i < e; i++)		for (unsigned i = 0, e = desc.getNumFields(); i < e; i++)
types.push_back(fields[i].getType());		types.push_back(desc.getField(i).getType());
types.push_back(indexType);		types.push_back(indexType);
if (!isUniqueDim(rtp, d))		if (!isUniqueDim(rtp, d))
p = constantI1(builder, loc, false);		p = constantI1(builder, loc, false);
scf::IfOp ifOp2 = builder.create<scf::IfOp>(loc, types, p, /*else*/ true);		scf::IfOp ifOp2 = builder.create<scf::IfOp>(loc, types, p, /*else*/ true);
// If present (fields unaffected, update next to phim1).		// If present (fields unaffected, update next to phim1).
builder.setInsertionPointToStart(&ifOp2.getThenRegion().front());		builder.setInsertionPointToStart(&ifOp2.getThenRegion().front());
fields.push_back(phim1);
builder.create<scf::YieldOp>(loc, fields);		// FIXME: This does not look like a clean way, but probably the most
fields.pop_back();		// efficient way.
		desc.getFields().push_back(phim1);
		aartbik: yeah, agreed, this made more sense in the original but the alternative is to copy and create a fully new array....
		builder.create<scf::YieldOp>(loc, desc.getFields());
		desc.getFields().pop_back();

// If !present (changes fields, update next).		// If !present (changes fields, update next).
builder.setInsertionPointToStart(&ifOp2.getElseRegion().front());		builder.setInsertionPointToStart(&ifOp2.getElseRegion().front());
Value mszp1 = builder.create<arith::AddIOp>(loc, msz, one);		Value mszp1 = builder.create<arith::AddIOp>(loc, msz, one);
genStore(builder, loc, mszp1, fields[field], pp1);		genStore(builder, loc, mszp1, desc.getField(ptrIndex), pp1);
createPushback(builder, loc, fields, field + 1, indices[d]);		createPushback(builder, loc, desc, idxIndex, indices[d]);
// Prepare the next dimension "as needed".		// Prepare the next dimension "as needed".
if ((d + 1) < rank)		if ((d + 1) < rank)
allocSchemeForRank(builder, loc, rtp, fields, field + 2, d + 1);		allocSchemeForRank(builder, loc, desc, d + 1);
fields.push_back(msz);
builder.create<scf::YieldOp>(loc, fields);		desc.getFields().push_back(msz);
fields.pop_back();		builder.create<scf::YieldOp>(loc, desc.getFields());
		desc.getFields().pop_back();

// Update fields and return next pos.		// Update fields and return next pos.
builder.setInsertionPointAfter(ifOp2);		builder.setInsertionPointAfter(ifOp2);
unsigned o = 0;		unsigned o = 0;
for (unsigned i = 0, e = fields.size(); i < e; i++)		for (unsigned i = 0, e = desc.getNumFields(); i < e; i++)
fields[i] = ifOp2.getResult(o++);		desc.setField(i, ifOp2.getResult(o++));
return ifOp2.getResult(o);		return ifOp2.getResult(o);
}		}
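
The pseudocode in the comment above genCompressed can be tried out on plain vectors. The following is a minimal sketch, not the MLIR implementation: CompressedLevel and insertCompressed are hypothetical names, the pointers array is assumed to be pre-sized with zeros (one entry per parent position plus one), and the "prepare dimension d + 1" step is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the storage of one compressed level.
// This is not an MLIR type; the real fields are memref values.
struct CompressedLevel {
  std::vector<std::size_t> pointers; // one entry per parent position, plus one
  std::vector<std::size_t> indices;  // stored coordinates of this level
};

// Inserts coordinate `crd` under parent position `parentPos` and returns the
// position of the entry at this level (the `next` of the pseudocode).
std::size_t insertCompressed(CompressedLevel &lvl, std::size_t parentPos,
                             std::size_t crd) {
  assert(parentPos + 1 < lvl.pointers.size());
  std::size_t plo = lvl.pointers[parentPos];
  std::size_t phi = lvl.pointers[parentPos + 1];
  std::size_t msz = lvl.indices.size();
  bool present = false;
  if (plo < phi) {
    // The segment is non-empty: only its last entry is checked.
    present = lvl.indices[phi - 1] == crd;
  } else {
    // First insertion under this parent: the segment starts at the end.
    lvl.pointers[parentPos] = msz;
  }
  if (present)
    return phi - 1;
  lvl.indices.push_back(crd);
  lvl.pointers[parentPos + 1] = msz + 1;
  return msz;
}
```

Because only the last stored coordinate of a segment is compared, the sketch shares the assumption stated for genInsert that insertions arrive in an appending order.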

/// Generates code along an insertion path without the need for a "cursor".		/// Generates code along an insertion path without the need for a "cursor".
/// This current insertion strategy comes at the expense of some testing		/// This current insertion strategy comes at the expense of some testing
/// overhead for each insertion. The strategy will be optimized later for		/// overhead for each insertion. The strategy will be optimized later for
/// common insertion patterns. The current insertion strategy also assumes		/// common insertion patterns. The current insertion strategy also assumes
/// insertions occur in "a reasonable order" that enables building the		/// insertions occur in "a reasonable order" that enables building the
/// storage scheme in an appending/inserting kind of fashion (i.e. no		/// storage scheme in an appending/inserting kind of fashion (i.e. no
/// in-between insertions that need data movement). The implementation		/// in-between insertions that need data movement). The implementation
/// relies on CSE/DCE to clean up all bookkeeping that is not needed.		/// relies on CSE/DCE to clean up all bookkeeping that is not needed.
///		///
/// TODO: better unord/not-unique; also generalize, optimize, specialize!		/// TODO: better unord/not-unique; also generalize, optimize, specialize!
///		///
static void genInsert(OpBuilder &builder, Location loc, RankedTensorType rtp,		static void genInsert(OpBuilder &builder, Location loc,
SmallVectorImpl<Value> &fields,		MutSparseTensorDescriptor desc,
SmallVectorImpl<Value> &indices, Value value) {		SmallVectorImpl<Value> &indices, Value value) {
		RankedTensorType rtp = desc.getTensorType();
unsigned rank = rtp.getShape().size();		unsigned rank = rtp.getShape().size();
assert(rank == indices.size());		assert(rank == indices.size());
unsigned field = fieldsIdx; // start past header
Value pos = constantZero(builder, loc, builder.getIndexType());		Value pos = constantZero(builder, loc, builder.getIndexType());
// Generate code for every dimension.		// Generate code for every dimension.
for (unsigned d = 0; d < rank; d++) {		for (unsigned d = 0; d < rank; d++) {
if (isCompressedDim(rtp, d)) {		if (isCompressedDim(rtp, d)) {
// Create:		// Create:
// if (!present) {		// if (!present) {
// indices[d].push_back(i[d])		// indices[d].push_back(i[d])
// <update pointers and prepare dimension d + 1>		// <update pointers and prepare dimension d + 1>
// }		// }
// pos[d] = indices.size() - 1		// pos[d] = indices.size() - 1
// <insert @ pos[d] at next dimension d + 1>		// <insert @ pos[d] at next dimension d + 1>
pos = genCompressed(builder, loc, rtp, fields, indices, value, pos, field,		pos = genCompressed(builder, loc, desc, indices, value, pos, d);
d);
field += 2;
} else if (isSingletonDim(rtp, d)) {		} else if (isSingletonDim(rtp, d)) {
// Create:		// Create:
// indices[d].push_back(i[d])		// indices[d].push_back(i[d])
// pos[d] = pos[d-1]		// pos[d] = pos[d-1]
// <insert @ pos[d] at next dimension d + 1>		// <insert @ pos[d] at next dimension d + 1>
createPushback(builder, loc, fields, field, indices[d]);		unsigned fidx = desc.getIdxMemRefIndex(d);
field += 1;		createPushback(builder, loc, desc, fidx, indices[d]);
} else {		} else {
assert(isDenseDim(rtp, d));		assert(isDenseDim(rtp, d));
// Construct the new position as:		// Construct the new position as:
// pos[d] = size * pos[d-1] + i[d]		// pos[d] = size * pos[d-1] + i[d]
// <insert @ pos[d] at next dimension d + 1>		// <insert @ pos[d] at next dimension d + 1>
Value size = sizeAtStoredDim(builder, loc, rtp, fields, d);		Value size = sizeAtStoredDim(builder, loc, desc, d);
Value mult = builder.create<arith::MulIOp>(loc, size, pos);		Value mult = builder.create<arith::MulIOp>(loc, size, pos);
pos = builder.create<arith::AddIOp>(loc, mult, indices[d]);		pos = builder.create<arith::AddIOp>(loc, mult, indices[d]);
}		}
}		}
// Reached the actual value append/insert.		// Reached the actual value append/insert.
if (!isDenseDim(rtp, rank - 1))		unsigned valIdx = desc.getValMemRefIndex();
createPushback(builder, loc, fields, field++, value);		if (!isDenseDim(rtp, rank - 1)) {
else		createPushback(builder, loc, desc, valIdx, value);
genStore(builder, loc, value, fields[field++], pos);		} else
assert(fields.size() == field);		genStore(builder, loc, value, desc.getValMemRef(), pos);
}		}
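
For the dense case in genInsert, the position update pos[d] = size * pos[d-1] + i[d] amounts to row-major linearization. A small sketch under that reading, with densePosition as a hypothetical helper operating on plain integers rather than on generated IR:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Folds the per-dimension update pos = size * pos + i over all dimensions,
// assuming every dimension is dense and stored in order.
std::size_t densePosition(const std::vector<std::size_t> &sizes,
                          const std::vector<std::size_t> &idx) {
  assert(sizes.size() == idx.size());
  std::size_t pos = 0; // same starting point as `pos` in genInsert
  for (std::size_t d = 0; d < sizes.size(); ++d) {
    assert(idx[d] < sizes[d]);
    pos = sizes[d] * pos + idx[d];
  }
  return pos;
}
```

For a 3x4 tensor, index (2, 1) maps to position 4 * 2 + 1 = 9.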

/// Generates insertion finalization code.		/// Generates insertion finalization code.
static void genEndInsert(OpBuilder &builder, Location loc, RankedTensorType rtp,		static void genEndInsert(OpBuilder &builder, Location loc,
SmallVectorImpl<Value> &fields) {		MutSparseTensorDescriptor desc) {
		RankedTensorType rtp = desc.getTensorType();
unsigned rank = rtp.getShape().size();		unsigned rank = rtp.getShape().size();
unsigned field = fieldsIdx; // start past header
for (unsigned d = 0; d < rank; d++) {		for (unsigned d = 0; d < rank; d++) {
if (isCompressedDim(rtp, d)) {		if (isCompressedDim(rtp, d)) {
// Compressed dimensions need a pointer cleanup for all entries		// Compressed dimensions need a pointer cleanup for all entries
// that were not visited during the insertion pass.		// that were not visited during the insertion pass.
//		//
// TODO: avoid cleanup and keep compressed scheme consistent at all times?		// TODO: avoid cleanup and keep compressed scheme consistent at all
		// times?
//		//
if (d > 0) {		if (d > 0) {
unsigned ptrWidth = getSparseTensorEncoding(rtp).getPointerBitWidth();		unsigned ptrWidth = getSparseTensorEncoding(rtp).getPointerBitWidth();
Type indexType = builder.getIndexType();		Type indexType = builder.getIndexType();
Type ptrType = ptrWidth ? builder.getIntegerType(ptrWidth) : indexType;		Type ptrType = ptrWidth ? builder.getIntegerType(ptrWidth) : indexType;
Value mz = constantIndex(builder, loc, getMemSizesIndex(field));		Value ptrMemRef = desc.getPtrMemRef(d);
Value hi = genLoad(builder, loc, fields[memSizesIdx], mz);		Value mz = constantIndex(builder, loc, desc.getPtrMemSizesIndex(d));
		Value hi = genLoad(builder, loc, desc.getMemSizesMemRef(), mz);
Value zero = constantIndex(builder, loc, 0);		Value zero = constantIndex(builder, loc, 0);
Value one = constantIndex(builder, loc, 1);		Value one = constantIndex(builder, loc, 1);
// Vector of only one, but needed by createFor's prototype.		// Vector of only one, but needed by createFor's prototype.
SmallVector<Value, 1> inits{genLoad(builder, loc, fields[field], zero)};		SmallVector<Value, 1> inits{genLoad(builder, loc, ptrMemRef, zero)};
scf::ForOp loop = createFor(builder, loc, hi, inits, one);		scf::ForOp loop = createFor(builder, loc, hi, inits, one);
Value i = loop.getInductionVar();		Value i = loop.getInductionVar();
Value oldv = loop.getRegionIterArg(0);		Value oldv = loop.getRegionIterArg(0);
Value newv = genLoad(builder, loc, fields[field], i);		Value newv = genLoad(builder, loc, ptrMemRef, i);
Value ptrZero = constantZero(builder, loc, ptrType);		Value ptrZero = constantZero(builder, loc, ptrType);
Value cond = builder.create<arith::CmpIOp>(		Value cond = builder.create<arith::CmpIOp>(
loc, arith::CmpIPredicate::eq, newv, ptrZero);		loc, arith::CmpIPredicate::eq, newv, ptrZero);
scf::IfOp ifOp = builder.create<scf::IfOp>(loc, TypeRange(ptrType),		scf::IfOp ifOp = builder.create<scf::IfOp>(loc, TypeRange(ptrType),
cond, /*else*/ true);		cond, /*else*/ true);
builder.setInsertionPointToStart(&ifOp.getThenRegion().front());		builder.setInsertionPointToStart(&ifOp.getThenRegion().front());
genStore(builder, loc, oldv, fields[field], i);		genStore(builder, loc, oldv, ptrMemRef, i);
builder.create<scf::YieldOp>(loc, oldv);		builder.create<scf::YieldOp>(loc, oldv);
builder.setInsertionPointToStart(&ifOp.getElseRegion().front());		builder.setInsertionPointToStart(&ifOp.getElseRegion().front());
builder.create<scf::YieldOp>(loc, newv);		builder.create<scf::YieldOp>(loc, newv);
builder.setInsertionPointAfter(ifOp);		builder.setInsertionPointAfter(ifOp);
builder.create<scf::YieldOp>(loc, ifOp.getResult(0));		builder.create<scf::YieldOp>(loc, ifOp.getResult(0));
builder.setInsertionPointAfter(loop);		builder.setInsertionPointAfter(loop);
}		}
field += 2;
} else if (isSingletonDim(rtp, d)) {
field++;
} else {		} else {
assert(isDenseDim(rtp, d));		assert(isDenseDim(rtp, d) || isSingletonDim(rtp, d));
}		}
}		}
assert(fields.size() == ++field);
}		}
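
The pointer cleanup that genEndInsert emits for compressed dimensions can be pictured as a forward fill over the pointers array. This is a rough sketch on a plain vector; finalizePointers is a hypothetical name, and starting the loop at index 1 is an assumption about how the trailing `one` argument to createFor is used.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Every entry that is still zero (never visited during insertion) is
// overwritten with the most recent non-zero value seen before it.
void finalizePointers(std::vector<std::size_t> &pointers) {
  assert(!pointers.empty());
  std::size_t oldv = pointers[0]; // plays the role of the loop-carried value
  for (std::size_t i = 1; i < pointers.size(); ++i) {
    if (pointers[i] == 0)
      pointers[i] = oldv; // unvisited: reuse the previous segment end
    else
      oldv = pointers[i]; // visited: this becomes the new carried value
  }
}
```

An array such as {0, 2, 2, 3, 0}, where the last parent position was never visited, becomes {0, 2, 2, 3, 3}.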

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Codegen rules.		// Codegen rules.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// Sparse tensor storage conversion rule for returns.		/// Sparse tensor storage conversion rule for returns.
class SparseReturnConverter : public OpConversionPattern<func::ReturnOp> {		class SparseReturnConverter : public OpConversionPattern<func::ReturnOp> {
[... 74 lines not shown ...]
/// Sparse codegen rule for dimension accesses.		/// Sparse codegen rule for dimension accesses.
class SparseDimOpConverter : public OpConversionPattern<tensor::DimOp> {		class SparseDimOpConverter : public OpConversionPattern<tensor::DimOp> {
public:		public:
using OpConversionPattern::OpConversionPattern;		using OpConversionPattern::OpConversionPattern;
LogicalResult		LogicalResult
matchAndRewrite(tensor::DimOp op, OpAdaptor adaptor,		matchAndRewrite(tensor::DimOp op, OpAdaptor adaptor,
ConversionPatternRewriter &rewriter) const override {		ConversionPatternRewriter &rewriter) const override {
Optional<int64_t> index = op.getConstantIndex();		Optional<int64_t> index = op.getConstantIndex();
if (!index)		if (!index || !getSparseTensorEncoding(adaptor.getSource().getType()))
return failure();		return failure();
auto sz =
sizeFromTensorAtDim(rewriter, op.getLoc(),		auto desc = getDescriptorFromTensorTuple(adaptor.getSource());
op.getSource().getType().cast<RankedTensorType>(),		auto sz = sizeFromTensorAtDim(rewriter, op.getLoc(), desc, *index);
adaptor.getSource(), *index);
if (!sz)		if (!sz)
return failure();		return failure();

rewriter.replaceOp(op, *sz);		rewriter.replaceOp(op, *sz);
return success();		return success();
}		}
};		};

[... 73 lines not shown ...]

/// Sparse codegen rule for tensor rematerialization.		/// Sparse codegen rule for tensor rematerialization.
class SparseTensorLoadConverter : public OpConversionPattern<LoadOp> {		class SparseTensorLoadConverter : public OpConversionPattern<LoadOp> {
public:		public:
using OpConversionPattern::OpConversionPattern;		using OpConversionPattern::OpConversionPattern;
LogicalResult		LogicalResult
matchAndRewrite(LoadOp op, OpAdaptor adaptor,		matchAndRewrite(LoadOp op, OpAdaptor adaptor,
ConversionPatternRewriter &rewriter) const override {		ConversionPatternRewriter &rewriter) const override {
RankedTensorType srcType =		// Prepare descriptor.
op.getTensor().getType().cast<RankedTensorType>();		SmallVector<Value> fields;
auto tuple = getTuple(adaptor.getTensor());		auto desc = getMutDescriptorFromTensorTuple(adaptor.getTensor(), fields);
// Prepare fields.
SmallVector<Value> fields(tuple.getInputs());
// Generate optional insertion finalization code.		// Generate optional insertion finalization code.
if (op.getHasInserts())		if (op.getHasInserts())
genEndInsert(rewriter, op.getLoc(), srcType, fields);		genEndInsert(rewriter, op.getLoc(), desc);
// Replace operation with resulting memrefs.		// Replace operation with resulting memrefs.
rewriter.replaceOp(op, genTuple(rewriter, op.getLoc(), srcType, fields));		rewriter.replaceOp(op, genTuple(rewriter, op.getLoc(), desc));
return success();		return success();
}		}
};		};

/// Sparse codegen rule for the expand op.		/// Sparse codegen rule for the expand op.
class SparseExpandConverter : public OpConversionPattern<ExpandOp> {		class SparseExpandConverter : public OpConversionPattern<ExpandOp> {
public:		public:
using OpConversionPattern::OpConversionPattern;		using OpConversionPattern::OpConversionPattern;
LogicalResult		LogicalResult
matchAndRewrite(ExpandOp op, OpAdaptor adaptor,		matchAndRewrite(ExpandOp op, OpAdaptor adaptor,
ConversionPatternRewriter &rewriter) const override {		ConversionPatternRewriter &rewriter) const override {
		if (!getSparseTensorEncoding(op.getTensor().getType()))
		return failure();
Location loc = op->getLoc();		Location loc = op->getLoc();
		auto desc = getDescriptorFromTensorTuple(adaptor.getTensor());
RankedTensorType srcType =		RankedTensorType srcType =
op.getTensor().getType().cast<RankedTensorType>();		op.getTensor().getType().cast<RankedTensorType>();
Type eltType = srcType.getElementType();		Type eltType = srcType.getElementType();
Type boolType = rewriter.getIntegerType(1);		Type boolType = rewriter.getIntegerType(1);
Type idxType = rewriter.getIndexType();		Type idxType = rewriter.getIndexType();
// All initialization should be done on entry of the loop nest.		// All initialization should be done on entry of the loop nest.
rewriter.setInsertionPointAfter(op.getTensor().getDefiningOp());		rewriter.setInsertionPointAfter(op.getTensor().getDefiningOp());
// Determine the size for access expansion (always the innermost stored		// Determine the size for access expansion (always the innermost stored
// dimension size, translated back to original dimension). Note that we		// dimension size, translated back to original dimension). Note that we
// recursively rewrite the new DimOp on the original tensor.		// recursively rewrite the new DimOp on the original tensor.
unsigned innerDim = toOrigDim(srcType, srcType.getRank() - 1);		unsigned innerDim = toOrigDim(srcType, srcType.getRank() - 1);
auto sz = sizeFromTensorAtDim(rewriter, loc, srcType, adaptor.getTensor(),		auto sz = sizeFromTensorAtDim(rewriter, loc, desc, innerDim);
innerDim);
assert(sz); // This for sure is a sparse tensor		assert(sz); // This for sure is a sparse tensor
// Generate a memref for `sz` elements of type `t`.		// Generate a memref for `sz` elements of type `t`.
auto genAlloc = [&](Type t) {		auto genAlloc = [&](Type t) {
auto memTp = MemRefType::get({ShapedType::kDynamic}, t);		auto memTp = MemRefType::get({ShapedType::kDynamic}, t);
return rewriter.create<memref::AllocOp>(loc, memTp, ValueRange{*sz});		return rewriter.create<memref::AllocOp>(loc, memTp, ValueRange{*sz});
};		};
// Allocate temporary buffers for values/filled-switch and added.		// Allocate temporary buffers for values/filled-switch and added.
// We do not use stack buffers for this, since the expanded size may		// We do not use stack buffers for this, since the expanded size may
[... 23 lines not shown ...]
/// Sparse codegen rule for the compress operator.		/// Sparse codegen rule for the compress operator.
class SparseCompressConverter : public OpConversionPattern<CompressOp> {		class SparseCompressConverter : public OpConversionPattern<CompressOp> {
public:		public:
using OpConversionPattern::OpConversionPattern;		using OpConversionPattern::OpConversionPattern;
LogicalResult		LogicalResult
matchAndRewrite(CompressOp op, OpAdaptor adaptor,		matchAndRewrite(CompressOp op, OpAdaptor adaptor,
ConversionPatternRewriter &rewriter) const override {		ConversionPatternRewriter &rewriter) const override {
Location loc = op->getLoc();		Location loc = op->getLoc();
RankedTensorType dstType =		SmallVector<Value> fields;
op.getTensor().getType().cast<RankedTensorType>();		auto desc = getMutDescriptorFromTensorTuple(adaptor.getTensor(), fields);
Type eltType = dstType.getElementType();
auto tuple = getTuple(adaptor.getTensor());
Value values = adaptor.getValues();		Value values = adaptor.getValues();
Value filled = adaptor.getFilled();		Value filled = adaptor.getFilled();
Value added = adaptor.getAdded();		Value added = adaptor.getAdded();
Value count = adaptor.getCount();		Value count = adaptor.getCount();
// Prepare fields and indices.		RankedTensorType dstType = desc.getTensorType();
SmallVector<Value> fields(tuple.getInputs());		Type eltType = dstType.getElementType();
		// Prepare indices.
SmallVector<Value> indices(adaptor.getIndices());		SmallVector<Value> indices(adaptor.getIndices());
// If the innermost dimension is ordered, we need to sort the indices		// If the innermost dimension is ordered, we need to sort the indices
// in the "added" array prior to applying the compression.		// in the "added" array prior to applying the compression.
unsigned rank = dstType.getShape().size();		unsigned rank = dstType.getShape().size();
if (isOrderedDim(dstType, rank - 1))		if (isOrderedDim(dstType, rank - 1))
rewriter.create<SortOp>(loc, count, ValueRange{added}, ValueRange{});		rewriter.create<SortOp>(loc, count, ValueRange{added}, ValueRange{});
// While performing the insertions, we also need to reset the elements		// While performing the insertions, we also need to reset the elements
// of the values/filled-switch by only iterating over the set elements,		// of the values/filled-switch by only iterating over the set elements,
// to ensure that the runtime complexity remains proportional to the		// to ensure that the runtime complexity remains proportional to the
// sparsity of the expanded access pattern.		// sparsity of the expanded access pattern.
//		//
// Generate		// Generate
// out_memrefs = for (i = 0; i < count; i++)(in_memrefs) {		// out_memrefs = for (i = 0; i < count; i++)(in_memrefs) {
// index = added[i];		// index = added[i];
// value = values[index];		// value = values[index];
// insert({prev_indices, index}, value);		// insert({prev_indices, index}, value);
// new_memrefs = insert(in_memrefs, {prev_indices, index}, value);		// new_memrefs = insert(in_memrefs, {prev_indices, index}, value);
// values[index] = 0;		// values[index] = 0;
// filled[index] = false;		// filled[index] = false;
// yield new_memrefs		// yield new_memrefs
// }		// }
scf::ForOp loop = createFor(rewriter, loc, count, fields);		scf::ForOp loop = createFor(rewriter, loc, count, desc.getFields());
Value i = loop.getInductionVar();		Value i = loop.getInductionVar();
Value index = genLoad(rewriter, loc, added, i);		Value index = genLoad(rewriter, loc, added, i);
Value value = genLoad(rewriter, loc, values, index);		Value value = genLoad(rewriter, loc, values, index);
indices.push_back(index);		indices.push_back(index);
// TODO: faster for subsequent insertions?		// TODO: faster for subsequent insertions?
genInsert(rewriter, loc, dstType, fields, indices, value);		genInsert(rewriter, loc, desc, indices, value);
genStore(rewriter, loc, constantZero(rewriter, loc, eltType), values,		genStore(rewriter, loc, constantZero(rewriter, loc, eltType), values,
index);		index);
genStore(rewriter, loc, constantI1(rewriter, loc, false), filled, index);		genStore(rewriter, loc, constantI1(rewriter, loc, false), filled, index);
rewriter.create<scf::YieldOp>(loc, fields);		rewriter.create<scf::YieldOp>(loc, desc.getFields());
rewriter.setInsertionPointAfter(loop);		rewriter.setInsertionPointAfter(loop);
Value result = genTuple(rewriter, loc, dstType, loop->getResults());		Value result = genTuple(rewriter, loc, dstType, loop->getResults());
// Deallocate the buffers on exit of the full loop nest.		// Deallocate the buffers on exit of the full loop nest.
Operation *parent = getTop(op);		Operation *parent = getTop(op);
rewriter.setInsertionPointAfter(parent);		rewriter.setInsertionPointAfter(parent);
rewriter.create<memref::DeallocOp>(loc, values);		rewriter.create<memref::DeallocOp>(loc, values);
rewriter.create<memref::DeallocOp>(loc, filled);		rewriter.create<memref::DeallocOp>(loc, filled);
rewriter.create<memref::DeallocOp>(loc, added);		rewriter.create<memref::DeallocOp>(loc, added);
// Replace operation with resulting memrefs.		// Replace operation with resulting memrefs.
rewriter.replaceOp(op, result);		rewriter.replaceOp(op, result);
return success();		return success();
}		}
};		};

/// Sparse codegen rule for the insert operator.		/// Sparse codegen rule for the insert operator.
class SparseInsertConverter : public OpConversionPattern<InsertOp> {		class SparseInsertConverter : public OpConversionPattern<InsertOp> {
public:		public:
using OpConversionPattern::OpConversionPattern;		using OpConversionPattern::OpConversionPattern;
LogicalResult		LogicalResult
matchAndRewrite(InsertOp op, OpAdaptor adaptor,		matchAndRewrite(InsertOp op, OpAdaptor adaptor,
ConversionPatternRewriter &rewriter) const override {		ConversionPatternRewriter &rewriter) const override {
RankedTensorType dstType =		SmallVector<Value> fields;
op.getTensor().getType().cast<RankedTensorType>();		auto desc = getMutDescriptorFromTensorTuple(adaptor.getTensor(), fields);
auto tuple = getTuple(adaptor.getTensor());		// Prepare indices.
// Prepare fields and indices.
SmallVector<Value> fields(tuple.getInputs());
SmallVector<Value> indices(adaptor.getIndices());		SmallVector<Value> indices(adaptor.getIndices());
// Generate insertion.		// Generate insertion.
Value value = adaptor.getValue();		Value value = adaptor.getValue();
genInsert(rewriter, op->getLoc(), dstType, fields, indices, value);		genInsert(rewriter, op.getLoc(), desc, indices, value);
// Replace operation with resulting memrefs.		// Replace operation with resulting memrefs.
rewriter.replaceOp(op, genTuple(rewriter, op.getLoc(), dstType, fields));		rewriter.replaceOp(op, genTuple(rewriter, op.getLoc(), desc));
return success();		return success();
}		}
};		};

/// Base class for getter-like operations, e.g., to_indices, to_pointers.		/// Base class for getter-like operations, e.g., to_indices, to_pointers.
template <typename SourceOp, typename Base>		template <typename SourceOp, typename Base>
class SparseGetterOpConverter : public OpConversionPattern<SourceOp> {		class SparseGetterOpConverter : public OpConversionPattern<SourceOp> {
public:		public:
using OpAdaptor = typename SourceOp::Adaptor;		using OpAdaptor = typename SourceOp::Adaptor;
using OpConversionPattern<SourceOp>::OpConversionPattern;		using OpConversionPattern<SourceOp>::OpConversionPattern;
LogicalResult		LogicalResult
matchAndRewrite(SourceOp op, OpAdaptor adaptor,		matchAndRewrite(SourceOp op, OpAdaptor adaptor,
ConversionPatternRewriter &rewriter) const override {		ConversionPatternRewriter &rewriter) const override {
// Replace the requested pointer access with corresponding field.		// Replace the requested pointer access with corresponding field.
// The cast_op is inserted by type converter to intermix 1:N type		// The cast_op is inserted by type converter to intermix 1:N type
// conversion.		// conversion.
auto tuple = getTuple(adaptor.getTensor());		auto desc = getDescriptorFromTensorTuple(adaptor.getTensor());
unsigned idx = Base::getIndexForOp(tuple, op);		Value field = Base::getFieldForOp(desc, op);
auto fields = tuple.getInputs();		rewriter.replaceOp(op, field);
assert(idx < fields.size());
rewriter.replaceOp(op, fields[idx]);
return success();		return success();
}		}
};		};

/// Sparse codegen rule for pointer accesses.		/// Sparse codegen rule for pointer accesses.
class SparseToPointersConverter		class SparseToPointersConverter
: public SparseGetterOpConverter<ToPointersOp, SparseToPointersConverter> {		: public SparseGetterOpConverter<ToPointersOp, SparseToPointersConverter> {
public:		public:
using SparseGetterOpConverter::SparseGetterOpConverter;		using SparseGetterOpConverter::SparseGetterOpConverter;
// Callback for SparseGetterOpConverter.		// Callback for SparseGetterOpConverter.
static unsigned getIndexForOp(UnrealizedConversionCastOp /*tuple*/,		static Value getFieldForOp(const SparseTensorDescriptor &desc,
ToPointersOp op) {		ToPointersOp op) {
uint64_t dim = op.getDimension().getZExtValue();		uint64_t dim = op.getDimension().getZExtValue();
return getFieldIndex(op.getTensor().getType(), /*ptrDim=*/dim, -1u);		return desc.getPtrMemRef(dim);
}		}
};		};

/// Sparse codegen rule for index accesses.		/// Sparse codegen rule for index accesses.
class SparseToIndicesConverter		class SparseToIndicesConverter
: public SparseGetterOpConverter<ToIndicesOp, SparseToIndicesConverter> {		: public SparseGetterOpConverter<ToIndicesOp, SparseToIndicesConverter> {
public:		public:
using SparseGetterOpConverter::SparseGetterOpConverter;		using SparseGetterOpConverter::SparseGetterOpConverter;
// Callback for SparseGetterOpConverter.		// Callback for SparseGetterOpConverter.
static unsigned getIndexForOp(UnrealizedConversionCastOp /*tuple*/,		static Value getFieldForOp(const SparseTensorDescriptor &desc,
ToIndicesOp op) {		ToIndicesOp op) {
uint64_t dim = op.getDimension().getZExtValue();		uint64_t dim = op.getDimension().getZExtValue();
return getFieldIndex(op.getTensor().getType(), -1u, /*idxDim=*/dim);		return desc.getIdxMemRef(dim);
}		}
};		};

/// Sparse codegen rule for value accesses.		/// Sparse codegen rule for value accesses.
class SparseToValuesConverter		class SparseToValuesConverter
: public SparseGetterOpConverter<ToValuesOp, SparseToValuesConverter> {		: public SparseGetterOpConverter<ToValuesOp, SparseToValuesConverter> {
public:		public:
using SparseGetterOpConverter::SparseGetterOpConverter;		using SparseGetterOpConverter::SparseGetterOpConverter;
// Callback for SparseGetterOpConverter.		// Callback for SparseGetterOpConverter.
static unsigned getIndexForOp(UnrealizedConversionCastOp tuple,		static Value getFieldForOp(const SparseTensorDescriptor &desc,
ToValuesOp /*op*/) {		ToValuesOp /*op*/) {
// The last field holds the value buffer.		return desc.getValMemRef();
return tuple.getInputs().size() - 1;
}		}
};		};
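
The getter converters above show the point of the refactoring: callers ask the descriptor for a field by role instead of tracking a running `field` offset. As an illustration only, a toy descriptor over plain indices might look like the following. ToyDescriptor, LevelKind, and all method names are hypothetical; the layout (two header fields, then two fields per compressed level and one per singleton level, then the values field) is an assumption read off the left-hand side of this diff, not the actual SparseTensorDescriptor API.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

enum class LevelKind { Dense, Compressed, Singleton };

// Hypothetical, simplified descriptor: it only computes field indices.
class ToyDescriptor {
public:
  explicit ToyDescriptor(std::vector<LevelKind> levels)
      : levels(std::move(levels)) {}

  // Assumed header: dimension sizes and memory sizes come first.
  static constexpr std::size_t kHeaderFields = 2;

  std::size_t getPtrIndex(std::size_t d) const {
    assert(d < levels.size() && levels[d] == LevelKind::Compressed);
    return levelStart(d);
  }
  std::size_t getIdxIndex(std::size_t d) const {
    assert(d < levels.size() && levels[d] != LevelKind::Dense);
    // A compressed level stores its pointers first, then its indices.
    return levelStart(d) + (levels[d] == LevelKind::Compressed ? 1 : 0);
  }
  std::size_t getValIndex() const { return levelStart(levels.size()); }
  std::size_t getNumFields() const { return getValIndex() + 1; }

private:
  // Index of the first field that belongs to level `d`.
  std::size_t levelStart(std::size_t d) const {
    assert(d <= levels.size());
    std::size_t field = kHeaderFields;
    for (std::size_t i = 0; i < d; ++i) {
      if (levels[i] == LevelKind::Compressed)
        field += 2;
      else if (levels[i] == LevelKind::Singleton)
        field += 1;
    }
    return field;
  }

  std::vector<LevelKind> levels;
};
```

With this layout, a dense-then-compressed tensor has its pointers at field 2, indices at field 3, and values at field 4, and no caller needs to repeat the `field += 2` bookkeeping removed by the patch.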

/// Sparse codegen rule for the convert operator.		/// Sparse codegen rule for the convert operator.
class SparseConvertConverter : public OpConversionPattern<ConvertOp> {		class SparseConvertConverter : public OpConversionPattern<ConvertOp> {
public:		public:
using OpConversionPattern::OpConversionPattern;		using OpConversionPattern::OpConversionPattern;
LogicalResult		LogicalResult
[... 15 lines not shown ...]
class SparseNumberOfEntriesConverter		class SparseNumberOfEntriesConverter
: public OpConversionPattern<NumberOfEntriesOp> {		: public OpConversionPattern<NumberOfEntriesOp> {
public:		public:
using OpConversionPattern::OpConversionPattern;		using OpConversionPattern::OpConversionPattern;
LogicalResult		LogicalResult
matchAndRewrite(NumberOfEntriesOp op, OpAdaptor adaptor,		matchAndRewrite(NumberOfEntriesOp op, OpAdaptor adaptor,
ConversionPatternRewriter &rewriter) const override {		ConversionPatternRewriter &rewriter) const override {
// Query memSizes for the actually stored values size.		// Query memSizes for the actually stored values size.
auto tuple = getTuple(adaptor.getTensor());		auto desc = getDescriptorFromTensorTuple(adaptor.getTensor());
auto fields = tuple.getInputs();
unsigned lastField = fields.size() - 1;
Value field =		Value field =
constantIndex(rewriter, op.getLoc(), getMemSizesIndex(lastField));		constantIndex(rewriter, op.getLoc(), desc.getValMemSizesIndex());
rewriter.replaceOpWithNewOp<memref::LoadOp>(op, fields[memSizesIdx], field);		rewriter.replaceOpWithNewOp<memref::LoadOp>(op, desc.getMemSizesMemRef(),
		field);
return success();		return success();
}		}
};		};

} // namespace		} // namespace

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Sparse tensor type conversion into an actual buffer.		// Sparse tensor type conversion into an actual buffer.
[... 38 lines not shown ...]