This is an archive of the discontinued LLVM Phabricator instance.

Differential D138627

[mlir][sparse] Refactoring: abstract sparse tensor memory scheme into a SparseTensorDescriptor class.
Closed · Public

Authored by Peiming on Nov 23 2022, 4:50 PM.

Details

Reviewers

aartbik
nicolasvasilache
wrengr
bixia

Commits

rG8a7e69d145ff: [mlir][sparse] Refactoring: abstract sparse tensor memory scheme into a…

Summary

This patch abstracts the sparse tensor memory scheme into a SparseTensorDescriptor class. Previously, field accesses were performed in a relatively error-prone way; this patch hides the hairy details behind a SparseTensorDescriptor class so that users can access sparse tensor fields in a more cohesive way.
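
To illustrate the kind of access pattern the summary describes, here is a minimal standalone sketch in plain C++ with no MLIR dependency. The `Field` alias, the `Level` enum, and this `Descriptor` class are hypothetical stand-ins, and the layout (two metadata fields, then per-level data fields, then a trailing values field) is an assumption modeled on the constants visible in the diff, not the committed implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; the real patch works on mlir::Value and DimLevelType.
using Field = std::string;
enum class Level { Dense, Compressed, Singleton };

// Assumed layout: [dimSizes, memSizes, <per-level data fields>..., values].
class Descriptor {
public:
  Descriptor(std::vector<Level> lvls, std::vector<Field> flds)
      : levels(std::move(lvls)), fields(std::move(flds)) {
    assert(fields.size() == numFields(levels));
  }

  // Total field count: 2 metadata fields, 2 per compressed level,
  // 1 per singleton level, and 1 values field.
  static std::size_t numFields(const std::vector<Level> &lvls) {
    std::size_t n = kDataFieldIdx + 1;
    for (Level l : lvls)
      n += (l == Level::Compressed) ? 2 : (l == Level::Singleton) ? 1 : 0;
    return n;
  }

  // Named accessors replace hand-computed indices at every call site.
  const Field &dimSizes() const { return fields[kDimSizesIdx]; }
  const Field &memSizes() const { return fields[kMemSizesIdx]; }
  const Field &values() const { return fields.back(); }

private:
  static constexpr std::size_t kDimSizesIdx = 0;
  static constexpr std::size_t kMemSizesIdx = 1;
  static constexpr std::size_t kDataFieldIdx = 2;
  std::vector<Level> levels;
  std::vector<Field> fields;
};
```

With a wrapper of this shape, a caller writes `desc.values()` rather than indexing the raw field list with an offset it has to compute itself.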

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

Peiming created this revision. Nov 23 2022, 4:50 PM

Herald added a reviewer: aartbik. Nov 23 2022, 4:50 PM

Herald added a project: Restricted Project.

Herald added subscribers: jsetoain, Moerafaat, anlunx and 21 others.

Peiming requested review of this revision. Nov 23 2022, 4:50 PM

Herald added a reviewer: nicolasvasilache. Nov 23 2022, 4:50 PM

Herald added a project: Restricted Project.

Herald added subscribers: stephenneuendorffer, nicolasvasilache.

minor improvement.

minor fixes.

Harbormaster completed remote builds in B199331: Diff 477663. Nov 23 2022, 6:33 PM

Peiming added reviewers: wrengr, bixia. Nov 30 2022, 9:38 AM

code cleanup.

Herald added a subscriber: hanchung. Nov 30 2022, 12:21 PM

Harbormaster completed remote builds in B200341: Diff 479047. Nov 30 2022, 12:41 PM

replaces more places using SparseTensorDescriptor.

Harbormaster completed remote builds in B200389: Diff 479113. Nov 30 2022, 5:33 PM

add some comments.

Harbormaster completed remote builds in B200431: Diff 479170. Nov 30 2022, 10:35 PM

replace all SmallVectorImpl to SparseTensorDescriptor

Peiming edited the summary of this revision. Dec 1 2022, 9:21 AM

minor fixes.

aartbik added inline comments. Dec 1 2022, 9:57 AM

mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.cpp
99	This description describes the layout and thus drives all the code. Can you please move this into the header, inside the class documentation.

add some comments.

cleanup.

Peiming added inline comments. Dec 1 2022, 11:31 AM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorCodegen.cpp
158	A better way to create PushBackOp would be `PushBackOp(builder, loc, descriptor, fidx, value)`, which is clearer and can be read as "push a value into the sparse tensor descriptor at the given field index". But it would require us to put `SparseTensorDescriptor` in a publicly available place (and I am not sure whether that is wanted).
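
The call shape suggested in this comment can be modeled outside MLIR. The sketch below is only an illustration in plain C++: the `Descriptor` struct, the `pushBack` helper, and the convention that `memSizes[i]` tracks the element count of data field `i` are assumptions for the example, not the dialect's actual builder API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: each data field is a growable buffer, and memSizes[i]
// records the number of elements stored in data field i.
struct Descriptor {
  // Assumed: two metadata fields precede the data fields.
  static constexpr std::size_t kDataFieldIdx = 2;
  std::vector<std::size_t> memSizes;
  std::vector<std::vector<long>> dataFields;

  explicit Descriptor(std::size_t numDataFields)
      : memSizes(numDataFields, 0), dataFields(numDataFields) {}
};

// Push `value` into the field at absolute field index `fidx`, keeping the
// size bookkeeping in one place instead of at every call site.
void pushBack(Descriptor &desc, std::size_t fidx, long value) {
  assert(fidx >= Descriptor::kDataFieldIdx && "metadata fields do not grow");
  std::size_t slot = fidx - Descriptor::kDataFieldIdx;
  assert(slot < desc.dataFields.size());
  desc.dataFields[slot].push_back(value);
  ++desc.memSizes[slot];
}
```

The point of the shape is that the caller names only the descriptor and a field index; translating the field index into a size slot happens inside the helper.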

Peiming added a child revision: D139141: [mlir][sparse] add getPointerType/getIndexType to SparseTensorEncodingAttr. Dec 1 2022, 1:25 PM

Harbormaster completed remote builds in B200573: Diff 479357. Dec 1 2022, 1:36 PM

Peiming removed a child revision: D139141: [mlir][sparse] add getPointerType/getIndexType to SparseTensorEncodingAttr. Dec 1 2022, 1:50 PM

rebase

cleanup.

code cleanup.

Harbormaster completed remote builds in B200632: Diff 479434. Dec 1 2022, 6:16 PM

rebase

aartbik added inline comments. Dec 2 2022, 4:29 PM

mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.h
444	I think this line should not be here? Copied from somewhere else?
465	relying on ad-hoc
485	, and a (Oxford comma ;-)
495	an array
498	layouted -> laid out? (I think that is the proper English, not sure though...)
520	to restore (no d)
545	it feels like we have many more getters than we really need? am I right, is there a way to condense the API a bit?
549	Although this solution is much better at information hiding than the original, we now need to scan each time for every field query. Although this is never very deep (rank bounded), it is still a bit wasteful. Can we avoid this by precomputing offsets per dimension?
556	this feels very convoluted? why do we need a dim for a value query?
mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorCodegen.cpp
242	note that this works for now, but we actually planned to implement a heuristic here, which will need to scan the ranks + level types again
264	continue
378	yeah, agreed, this made more sense in the original but the alternative is to copy and create a fully new array....

remove unused APIs + fix typos.

mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.h
549	I also thought about it. We could, but storing the map will make the class expensive to copy. How about I do it in the next revision and change all the descriptors to references?
556	deleted.
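
The trade-off discussed at line 549 (scanning the level types on every field query versus precomputing offsets once) can be sketched in plain C++. Everything here is hypothetical: the `Level` enum, `LevelOffsets`, `computeOffsets`, and `ptrFieldIndex` are illustrative names, and the ordering (pointer array before index array for a compressed level, data fields starting at index 2) is an assumption based on the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Level { Dense, Compressed, Singleton }; // hypothetical stand-in
constexpr std::size_t kNoField = static_cast<std::size_t>(-1);
constexpr std::size_t kDataFieldIdx = 2; // assumed: dimSizes, memSizes first

// Per-level field indices, computed once instead of on every query.
struct LevelOffsets {
  std::size_t ptr = kNoField; // pointer array (compressed levels only)
  std::size_t idx = kNoField; // index array (compressed and singleton levels)
};

// One pass over the level types builds the whole table.
std::vector<LevelOffsets> computeOffsets(const std::vector<Level> &levels) {
  std::vector<LevelOffsets> table(levels.size());
  std::size_t next = kDataFieldIdx;
  for (std::size_t d = 0; d < levels.size(); ++d) {
    if (levels[d] == Level::Compressed) {
      table[d].ptr = next++;
      table[d].idx = next++;
    } else if (levels[d] == Level::Singleton) {
      table[d].idx = next++;
    }
  }
  return table;
}

// Constant-time lookup once the table exists.
std::size_t ptrFieldIndex(const std::vector<LevelOffsets> &table,
                          std::size_t dim) {
  assert(dim < table.size() && table[dim].ptr != kNoField);
  return table[dim].ptr;
}
```

The table has one entry per level, so it is small, but it is owned state: copying a descriptor that holds it copies the table too, which is the cost raised in the reply and the reason passing descriptors by reference is proposed alongside it.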

Peiming added inline comments. Dec 2 2022, 5:08 PM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorCodegen.cpp
242	Then, you can have another foreach before this to compute the heuristic.

Harbormaster completed remote builds in B200871: Diff 479780. Dec 2 2022, 6:23 PM

rebase.

aartbik accepted this revision. Dec 5 2022, 2:08 PM

This revision is now accepted and ready to land. Dec 5 2022, 2:08 PM

This revision was landed with ongoing or failed builds. Dec 5 2022, 2:12 PM

Closed by commit rG8a7e69d145ff: [mlir][sparse] Refactoring: abstract sparse tensor memory scheme into a… (authored by Peiming).

This revision was automatically updated to reflect the committed changes.

Peiming added a commit: rG8a7e69d145ff: [mlir][sparse] Refactoring: abstract sparse tensor memory scheme into a….

Harbormaster completed remote builds in B201146: Diff 480148. Dec 5 2022, 2:23 PM

@Peiming, this patch breaks the flang build.

In D138627#3972470, @PeteSteinfeld wrote:

@Peiming, this patch breaks the flang build.

It is not a flang-specific issue, though. The following gcc buildbot fails: https://lab.llvm.org/buildbot/#/builders/160/builds/13724

In D138627#3972495, @vzakhari wrote:

It is not a flang-specific issue, though. The following gcc buildbot fails: https://lab.llvm.org/buildbot/#/builders/160/builds/13724

This also broke the windows mlir buildbot. Please address the issues or revert.

In D138627#3972566, @stella.stamenova wrote:

This also broke the windows mlir buildbot. Please address the issues or revert.

Okay, Investigating

The failure seems unrelated to this patch...

stella.stamenova added a reverting change: rG10033a179f0c: Revert "[mlir][sparse] Refactoring: abstract sparse tensor memory scheme into a…. Dec 5 2022, 5:21 PM

In D138627#3972566, @stella.stamenova wrote:

This also broke the windows mlir buildbot. Please address the issues or revert.

The windows failure should be fixed by https://reviews.llvm.org/D139383

In D138627#3972777, @Peiming wrote:

The failure seems unrelated to this patch...

The build log seems to directly point to your code (https://lab.llvm.org/buildbot/#/builders/160/builds/13724/steps/5/logs/stdio):

In file included from ../llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/SparseBufferRewriting.cpp:14:
../llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.h:396:13: error: explicit specialization in non-namespace scope ‘class mlir::sparse_tensor::SparseTensorDescriptorImpl<mut>’
  396 |   template <>
      |             ^
../llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.h:397:10: error: too few template-parameter-lists
  397 |   struct ArrayStorage<false> {
      |          ^~~~~~~~~~~~~~~~~~~
../llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.h:401:13: error: explicit specialization in non-namespace scope ‘class mlir::sparse_tensor::SparseTensorDescriptorImpl<mut>’
  401 |   template <>
      |             ^
../llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.h:402:10: error: too few template-parameter-lists
  402 |   struct ArrayStorage<true> {
      |          ^~~~~~~~~~~~~~~~~~

In D138627#3972803, @vzakhari wrote:

The build log seems to directly point to your code (https://lab.llvm.org/buildbot/#/builders/160/builds/13724/steps/5/logs/stdio):

Okay, I did not see this. I will try with GCC
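
The errors in the build log come from explicitly specializing a member template inside the body of the enclosing class template, which GCC rejects, as the log shows. One common portable alternative is to select the type with `std::conditional_t` instead of specializing a nested struct. The sketch below is illustrative only: `DescriptorImpl`, `Storage`, and the use of `std::vector<int>` are assumptions for the example and are not claimed to be the fix that was eventually committed.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// The pattern the log complains about, shown as a comment because GCC
// does not compile it:
//
//   template <bool mut> class DescriptorImpl {
//     template <bool> struct ArrayStorage;
//     template <> struct ArrayStorage<false> { /* ... */ };  // GCC error
//     template <> struct ArrayStorage<true>  { /* ... */ };  // GCC error
//   };

// Portable alternative: pick the storage type with std::conditional_t.
template <bool mut>
class DescriptorImpl {
public:
  // The mutable variant refers to a caller-owned vector; the immutable one
  // holds a read-only reference. Both choices are illustrative.
  using Storage = std::conditional_t<mut, std::vector<int> &,
                                     const std::vector<int> &>;

  explicit DescriptorImpl(Storage fields) : fields(fields) {}

  int get(std::size_t i) const {
    assert(i < fields.size());
    return fields[i];
  }

  // Participates in overload resolution only when mut is true.
  template <bool m = mut, typename = std::enable_if_t<m>>
  void set(std::size_t i, int v) {
    fields[i] = v;
  }

private:
  Storage fields;
};
```

Because no explicit specialization appears at class scope, this form does not depend on how a given compiler treats that construct.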

@Peiming : I reverted your change because it broke multiple buildbots - windows and gcc included. I see that you've re-committed it. Did you make sure to address all three build breaks that were reported including the one @vzakhari pointed out? In cases like this, please make sure NOT to recommit your changes without making sure that all build breaks are addressed as it is disruptive to have buildbots that are broken especially when a change is identified as the source.

In D138627#3972826, @stella.stamenova wrote:

@Peiming : I reverted your change because it broke multiple buildbots - windows and gcc included. I see that you've re-committed it. Did you make sure to address all three build breaks that were reported including the one @vzakhari pointed out? In cases like this, please make sure NOT to recommit your changes without making sure that all build breaks are addressed as it is disruptive to have buildbots that are broken especially when a change is identified as the source.

Yes, of course. I haven't pushed the code; I am waiting for the pre-merge Windows build to finish.

Revision Contents

Path                                                                Size
mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensor.h            15 lines
mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.h             141 lines
mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.cpp           100 lines
mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorCodegen.cpp    314 lines

Diff 477663

mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensor.h

(69 lines not shown)	inline bool isCompressedDim(RankedTensorType type, uint64_t d) {
return isCompressedDLT(getDimLevelType(type, d));		return isCompressedDLT(getDimLevelType(type, d));
}		}

/// Convenience function to test for singleton dimension (0 <= d < rank).		/// Convenience function to test for singleton dimension (0 <= d < rank).
inline bool isSingletonDim(RankedTensorType type, uint64_t d) {		inline bool isSingletonDim(RankedTensorType type, uint64_t d) {
return isSingletonDLT(getDimLevelType(type, d));		return isSingletonDLT(getDimLevelType(type, d));
}		}

		/// Convenience function to test for dense dimension (0 <= d < rank).
		inline bool isDenseDim(SparseTensorEncodingAttr enc, uint64_t d) {
		return isDenseDLT(getDimLevelType(enc, d));
		}

		/// Convenience function to test for compressed dimension (0 <= d < rank).
		inline bool isCompressedDim(SparseTensorEncodingAttr enc, uint64_t d) {
		return isCompressedDLT(getDimLevelType(enc, d));
		}

		/// Convenience function to test for singleton dimension (0 <= d < rank).
		inline bool isSingletonDim(SparseTensorEncodingAttr enc, uint64_t d) {
		return isSingletonDLT(getDimLevelType(enc, d));
		}

//		//
// Dimension level properties.		// Dimension level properties.
//		//

/// Convenience function to test for ordered property in the		/// Convenience function to test for ordered property in the
/// given dimension (0 <= d < rank).		/// given dimension (0 <= d < rank).
inline bool isOrderedDim(RankedTensorType type, uint64_t d) {		inline bool isOrderedDim(RankedTensorType type, uint64_t d) {
return isOrderedDLT(getDimLevelType(type, d));		return isOrderedDLT(getDimLevelType(type, d));
(27 lines not shown)

mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.h

(304 lines not shown)	inline Value constantDimLevelTypeEncoding(OpBuilder &builder, Location loc,
return constantI8(builder, loc, static_cast<uint8_t>(dlt));		return constantI8(builder, loc, static_cast<uint8_t>(dlt));
}		}

inline bool isZeroRankedTensorOrScalar(Type type) {		inline bool isZeroRankedTensorOrScalar(Type type) {
auto rtp = type.dyn_cast<RankedTensorType>();		auto rtp = type.dyn_cast<RankedTensorType>();
return !rtp || rtp.getRank() == 0;		return !rtp || rtp.getRank() == 0;
}		}

		class SparseTensorDescriptor {
		public:
		enum class FieldKind { DimSizes, MemSizes, PtrMemRef, IdxMemRef, ValMemRef };

		static void foreachFieldInSparseTensor(
		RankedTensorType,
		llvm::function_ref<bool(Type /*fieldType*/, unsigned /*fieldIdx*/,
		FieldKind /*fieldKind*/,
		unsigned /*dim (if applicable)*/,
		DimLevelType /*DLT (if applicable)*/)>);

		static unsigned getNumDataFieldsFromEncoding(SparseTensorEncodingAttr enc) {
		unsigned numDataFields = 1; // one value memref
		llvm::for_each(enc.getDimLevelType(), [&numDataFields](DimLevelType dlt) {
		if (isCompressedDLT(dlt))
		numDataFields += 2;
		else if (isSingletonDLT(dlt))
		numDataFields += 1;
		});
		return numDataFields;
		}

		static unsigned getNumFieldsFromEncoding(SparseTensorEncodingAttr enc) {
		return getNumDataFieldsFromEncoding(enc) +
		SparseTensorDescriptor::dataFieldIdx;
		}

		static unsigned getFieldMemSizesIndex(unsigned fid) {
		assert(fid >= dataFieldIdx);
		return fid - dataFieldIdx;
		}

		SparseTensorDescriptor(Type tp, ValueRange fields)
		: rType(tp.cast<RankedTensorType>()), fields(fields) {
		assert(getSparseTensorEncoding(tp) &&
		getNumFieldsFromEncoding(getSparseTensorEncoding(tp)) ==
		fields.size());
		}

		unsigned getPtrMemRefIndex(unsigned ptrDim) const {
		return getFieldIndex(ptrDim, FieldKind::PtrMemRef);
		}

		unsigned getIdxMemRefIndex(unsigned idxDim) const {
		return getFieldIndex(idxDim, FieldKind::IdxMemRef);
		}

		unsigned getValMemRefIndex() const { return fields.size() - 1; }

		unsigned getPtrMemSizesIndex(unsigned dim) const {
		return getPtrMemRefIndex(dim) - dataFieldIdx;
		}

		unsigned getIdxMemSizesIndex(unsigned dim) const {
		return getIdxMemRefIndex(dim) - dataFieldIdx;
		}

		unsigned getValMemSizesIndex() const {
		return getValMemRefIndex() - dataFieldIdx;
		}

		Value getDimSizesMemRef() const { return fields[dimSizesIdx]; }
		Value getMemSizesMemRef() const { return fields[memSizesIdx]; }

		Value getPtrMemRef(unsigned ptrDim) const {
		return fields[getPtrMemRefIndex(ptrDim)];
		}

		Value getIdxMemRef(unsigned idxDim) const {
		return fields[getIdxMemRefIndex(idxDim)];
		}

		Value getValMemRef() const { return fields[getValMemRefIndex()]; }

		Value getField(unsigned fid) const {
		assert(fid < fields.size());
		return fields[fid];
		}

		unsigned getNumFields() const { return fields.size(); }

		Type getPtrElementType() const {
		auto *ctx = rType.getContext();
		unsigned ptrWidth = getSparseTensorEncoding(rType).getPointerBitWidth();
		Type indexType = IndexType::get(ctx);
		return ptrWidth ? IntegerType::get(ctx, ptrWidth) : indexType;
		}

		Type getIdxElementType() const {
		auto *ctx = rType.getContext();
		unsigned idxWidth = getSparseTensorEncoding(rType).getIndexBitWidth();
		Type indexType = IndexType::get(ctx);
		return idxWidth ? IntegerType::get(ctx, idxWidth) : indexType;
		}

		public:
		// FIXME: This should be private.
		static constexpr uint64_t dimSizesIdx = 0;
		static constexpr uint64_t memSizesIdx = dimSizesIdx + 1;
		static constexpr uint64_t dataFieldIdx = memSizesIdx + 1;

		private:
		unsigned getFieldIndex(unsigned dim, FieldKind kind) const;

		RankedTensorType rType;
		ValueRange fields;
		};

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// SparseTensorLoopEmiter class, manages sparse tensors and helps to generate		// SparseTensorLoopEmiter class, manages sparse tensors and helps to
// loop structure to (co)-iterate sparse tensors.		// generate loop structure to (co)-iterate sparse tensors.
//		//
// An example usage:		// An example usage:
// To generate the following loops over T1<?x?> and T2<?x?>		// To generate the following loops over T1<?x?> and T2<?x?>
//		//
// for i in TENSOR_1_0 {		// for i in TENSOR_1_0 {
// for j : TENSOR_2_0 {		// for j : TENSOR_2_0 {
// for k : TENSOR_1_1 {}		// for k : TENSOR_1_1 {}
// for k : TENSOR_2_1 {}		// for k : TENSOR_2_1 {}
// }		// }
// }		// }
//		//
// One can use		// One can use
//		//
// SparseTensorLoopEmiter loopEmiter({T1, T1});		// SparseTensorLoopEmiter loopEmiter({T1, T1});
// loopEmiter.initializeLoopEmit();		// loopEmiter.initializeLoopEmit();
// loopEmiter.enterLoopOverTensorAtDim(T1, 0);		// loopEmiter.enterLoopOverTensorAtDim(T1, 0);
// loopEmiter.enterLoopOverTensorAtDim(T2, 0);		// loopEmiter.enterLoopOverTensorAtDim(T2, 0);
// loopEmiter.enterLoopOverTensorAtDim(T1, 1);		// loopEmiter.enterLoopOverTensorAtDim(T1, 1);
// loopEmiter.exitCurrentLoop();		// loopEmiter.exitCurrentLoop();
// loopEmiter.enterLoopOverTensorAtDim(T2, 1);		// loopEmiter.enterLoopOverTensorAtDim(T2, 1);
// loopEmiter.exitCurrentLoop(); // exit k		// loopEmiter.exitCurrentLoop(); // exit k
// loopEmiter.exitCurrentLoop(); // exit j		// loopEmiter.exitCurrentLoop(); // exit j
// loopEmiter.exitCurrentLoop(); // exit i		// loopEmiter.exitCurrentLoop(); // exit i
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

class SparseTensorLoopEmitter {		class SparseTensorLoopEmitter {
public:		public:
/// Optional callback function to setup dense output tensors when		/// Optional callback function to setup dense output tensors when
/// initializing the loop emitter (e.g., to fill a dense output with zeros).		/// initializing the loop emitter (e.g., to fill a dense output with zeros).
using OutputUpdater = function_ref<Value(OpBuilder &builder, Location loc,		using OutputUpdater = function_ref<Value(OpBuilder &builder, Location loc,
Value memref, Value tensor)>;		Value memref, Value tensor)>;

/// Constructor: take an array of tensors inputs, on which the generated loops		/// Constructor: take an array of tensors inputs, on which the generated
/// will iterate on. The index of the tensor in the array is also the		/// loops will iterate on. The index of the tensor in the array is also the
/// tensor id (tid) used in related functions.		/// tensor id (tid) used in related functions.
/// If isSparseOut is set, loop emitter assume that the sparse output tensor		/// If isSparseOut is set, loop emitter assume that the sparse output tensor
/// is empty, and will always generate loops on it based on the dim sizes.		/// is empty, and will always generate loops on it based on the dim sizes.
/// An optional array could be provided (by sparsification) to indicate the		/// An optional array could be provided (by sparsification) to indicate the
/// loop id sequence that will be generated. It is used to establish the		/// loop id sequence that will be generated. It is used to establish the
/// mapping between affineDimExpr to the corresponding loop index in the loop		/// mapping between affineDimExpr to the corresponding loop index in the
/// stack that are maintained by the loop emitter.		/// loop stack that are maintained by the loop emitter.
explicit SparseTensorLoopEmitter(ValueRange tensors,		explicit SparseTensorLoopEmitter(ValueRange tensors,
StringAttr loopTag = nullptr,		StringAttr loopTag = nullptr,
bool hasOutput = false,		bool hasOutput = false,
bool isSparseOut = false,		bool isSparseOut = false,
ArrayRef<unsigned> topSort = {});		ArrayRef<unsigned> topSort = {});

/// Starts a loop emitting session by generating all the buffers needed to		/// Starts a loop emitting session by generating all the buffers needed to
/// iterate tensors.		/// iterate tensors.
void initializeLoopEmit(OpBuilder &builder, Location loc,		void initializeLoopEmit(OpBuilder &builder, Location loc,
OutputUpdater updater = nullptr);		OutputUpdater updater = nullptr);

/// Generates a list of operations to compute the affine expression.		/// Generates a list of operations to compute the affine expression.
Value genAffine(OpBuilder &builder, AffineExpr a, Location loc);		Value genAffine(OpBuilder &builder, AffineExpr a, Location loc);

/// Enters a new loop sequence, the loops within the same sequence starts from		/// Enters a new loop sequence, the loops within the same sequence starts
/// the break points of previous loop instead of starting over from 0.		/// from the break points of previous loop instead of starting over from 0.
/// e.g.,		/// e.g.,
/// {		/// {
/// // loop sequence start.		/// // loop sequence start.
/// p0 = while(xxx)		/// p0 = while(xxx)
/// ...		/// ...
/// break p0		/// break p0
///		///
/// // Starts loop from p0		/// // Starts loop from p0
/// for (i = p0; i < end; i++)		/// for (i = p0; i < end; i++)
/// ...		/// ...
/// // loop sequence end.		/// // loop sequence end.
/// }		/// }
void enterNewLoopSeq(OpBuilder &builder, Location loc, ArrayRef<size_t> tids,		void enterNewLoopSeq(OpBuilder &builder, Location loc, ArrayRef<size_t> tids,
ArrayRef<size_t> dims);		ArrayRef<size_t> dims);

// exit the current loop sequence, this will reset universal index to 0.		// exit the current loop sequence, this will reset universal index to 0.
void exitCurrentLoopSeq() {		void exitCurrentLoopSeq() {
assert(loopSeqStack.size() == loopStack.size() + 1);		assert(loopSeqStack.size() == loopStack.size() + 1);
loopSeqStack.pop_back();		loopSeqStack.pop_back();
}		}

// TODO: Gets rid of `dim` in the argument list? Track the dimension we		// TODO: Gets rid of `dim` in the argument list? Track the dimension we
// are currently at internally. Then it would be enterNextDimForTensor.		// are currently at internally. Then it would be enterNextDimForTensor.
// Still need a way to specify the dim for non annoated dense tensor though,		// Still need a way to specify the dim for non annoated dense tensor though,
// as it can be accessed out of order.		// as it can be accessed out of order.
/// Emits loop over tensor_tid_dim, it assumes that loops between		/// Emits loop over tensor_tid_dim, it assumes that loops between
/// tensor_tid_[0, dim - 1] have already been generated.		/// tensor_tid_[0, dim - 1] have already been generated.
/// The function will also perform in-place update on the `reduc` vector to		/// The function will also perform in-place update on the `reduc` vector to
/// return the reduction variable used inside the generated loop.		/// return the reduction variable used inside the generated loop.
Operation *enterLoopOverTensorAtDim(OpBuilder &builder, Location loc,		Operation *enterLoopOverTensorAtDim(OpBuilder &builder, Location loc,
size_t tid, size_t dim,		size_t tid, size_t dim,
MutableArrayRef<Value> reduc = {},		MutableArrayRef<Value> reduc = {},
bool isParallel = false,		bool isParallel = false,
ArrayRef<size_t> extraTids = {},		ArrayRef<size_t> extraTids = {},
ArrayRef<size_t> extraDims = {});		ArrayRef<size_t> extraDims = {});

Operation *enterFilterLoopOverTensorAtDim(OpBuilder &builder, Location loc,		Operation *enterFilterLoopOverTensorAtDim(OpBuilder &builder, Location loc,
size_t tid, size_t dim,		size_t tid, size_t dim,
AffineExpr affine,		AffineExpr affine,
MutableArrayRef<Value> reduc = {});		MutableArrayRef<Value> reduc = {});

void genDenseAffineAddressAtCurLevel(OpBuilder &builder, Location loc,		void genDenseAffineAddressAtCurLevel(OpBuilder &builder, Location loc,
size_t tid, size_t dim,		size_t tid, size_t dim,
AffineExpr affine);		AffineExpr affine);

/// Emits a co-iteration loop over a set of tensors.		/// Emits a co-iteration loop over a set of tensors.
Operation *enterCoIterationOverTensorsAtDims(		Operation *enterCoIterationOverTensorsAtDims(
OpBuilder &builder, Location loc, ArrayRef<size_t> tids,		OpBuilder &builder, Location loc, ArrayRef<size_t> tids,
ArrayRef<size_t> dims, bool needsUniv, MutableArrayRef<Value> reduc = {},		ArrayRef<size_t> dims, bool needsUniv, MutableArrayRef<Value> reduc = {},
ArrayRef<size_t> extraTids = {}, ArrayRef<size_t> extraDims = {});		ArrayRef<size_t> extraTids = {}, ArrayRef<size_t> extraDims = {});

void exitCurrentLoop(RewriterBase &rewriter, Location loc,		void exitCurrentLoop(RewriterBase &rewriter, Location loc,
MutableArrayRef<Value> reduc = {});		MutableArrayRef<Value> reduc = {});

/// Returns the array of coordinate for all the loop generated till now.		/// Returns the array of coordinate for all the loop generated till now.
void getCoordinateArray(SmallVectorImpl<Value> &coords) const {		void getCoordinateArray(SmallVectorImpl<Value> &coords) const {
for (auto &l : loopStack)		for (auto &l : loopStack)
coords.push_back(l.iv);		coords.push_back(l.iv);
}		}

/// Gets loop induction variable at the given level.		/// Gets loop induction variable at the given level.
unsigned getCurrentDepth() const { return loopStack.size(); }		unsigned getCurrentDepth() const { return loopStack.size(); }

/// Gets loop induction variable at the given level.		/// Gets loop induction variable at the given level.
Value getLoopIV(size_t level) const {		Value getLoopIV(size_t level) const {
if (level < loopStack.size())		if (level < loopStack.size())
return loopStack[level].iv;		return loopStack[level].iv;
return nullptr;		return nullptr;
}		}

///		///
/// Getters.		/// Getters.
///		///
const std::vector<std::vector<Value>> &getPidxs() const { return pidxs; };		const std::vector<std::vector<Value>> &getPidxs() const { return pidxs; };
const std::vector<std::vector<Value>> &getCoord() const { return coord; };		const std::vector<std::vector<Value>> &getCoord() const { return coord; };
const std::vector<std::vector<Value>> &getHighs() const { return highs; };		const std::vector<std::vector<Value>> &getHighs() const { return highs; };
const std::vector<std::vector<Value>> &getPtrBuffer() const {		const std::vector<std::vector<Value>> &getPtrBuffer() const {
return ptrBuffer;		return ptrBuffer;
};		};
const std::vector<std::vector<Value>> &getIdxBuffer() const {		const std::vector<std::vector<Value>> &getIdxBuffer() const {
return idxBuffer;		return idxBuffer;
};		};
const std::vector<Value> &getValBuffer() const { return valBuffer; };		const std::vector<Value> &getValBuffer() const { return valBuffer; };

(62 lines not shown)	private:
/// will be transformed into		/// will be transformed into
/// %ret = parallel () init(%args) {		/// %ret = parallel () init(%args) {
/// ...		/// ...
/// scf.reduce(%c) bb0(%0, %1){		/// scf.reduce(%c) bb0(%0, %1){
/// %val = op %0, %1		/// %val = op %0, %1
/// scf.reduce.return %val		/// scf.reduce.return %val
/// }		/// }
/// }		/// }
/// NOTE: only one instruction will be moved into reduce block, transformation		/// NOTE: only one instruction will be moved into reduce block,
/// will fail if multiple instructions are used to compute the reduction		/// transformation will fail if multiple instructions are used to compute
/// value.		/// the reduction value. Return %ret to user, while %val is provided by
/// Return %ret to user, while %val is provided by users (`reduc`).		/// users (`reduc`).
void exitForLoop(RewriterBase &rewriter, Location loc,		void exitForLoop(RewriterBase &rewriter, Location loc,
MutableArrayRef<Value> reduc);		MutableArrayRef<Value> reduc);

/// Exits a while loop, returns the reduction results.		/// Exits a while loop, returns the reduction results.
void exitCoIterationLoop(OpBuilder &builder, Location loc,		void exitCoIterationLoop(OpBuilder &builder, Location loc,
MutableArrayRef<Value> reduc);		MutableArrayRef<Value> reduc);

/// A optional string attribute that should be attached to the loop generated		/// A optional string attribute that should be attached to the loop
/// by loop emitter, it might help following passes to identify loops that		/// generated by loop emitter, it might help following passes to identify
/// operates on sparse tensors more easily.		/// loops that operates on sparse tensors more easily.
StringAttr loopTag;		StringAttr loopTag;
/// Whether the loop emitter needs to treat the last tensor as the output		/// Whether the loop emitter needs to treat the last tensor as the output
/// tensor.		/// tensor.
bool hasOutput;		bool hasOutput;
bool isSparseOut;		bool isSparseOut;
/// Input and (optional) output tensors.		/// Input and (optional) output tensors.
std::vector<Value> tensors;		std::vector<Value> tensors;
/// The dim type array for each tensor.		/// The dim type array for each tensor.
std::vector<std::vector<DimLevelType>> dimTypes;		std::vector<std::vector<DimLevelType>> dimTypes;
/// Sparse iteration information (by tensor and dim). These arrays		/// Sparse iteration information (by tensor and dim). These arrays
/// are updated to remain current within the current loop.		/// are updated to remain current within the current loop.
std::vector<std::vector<Value>> pidxs;		std::vector<std::vector<Value>> pidxs;
std::vector<std::vector<Value>> coord;		std::vector<std::vector<Value>> coord;
std::vector<std::vector<Value>> highs;		std::vector<std::vector<Value>> highs;
std::vector<std::vector<Value>> ptrBuffer; // to_pointers		std::vector<std::vector<Value>> ptrBuffer; // to_pointers
std::vector<std::vector<Value>> idxBuffer; // to_indices		std::vector<std::vector<Value>> idxBuffer; // to_indices
std::vector<Value> valBuffer; // to_value		std::vector<Value> valBuffer; // to_value

// Loop Stack, stores the information of all the nested loops that are alive.		// Loop Stack, stores the information of all the nested loops that are
		// alive.
std::vector<LoopLevelInfo> loopStack;		std::vector<LoopLevelInfo> loopStack;

// Loop Sequence Stack, stores the universal index for the current loop		// Loop Sequence Stack, stores the universal index for the current loop
// sequence.		// sequence.
std::vector<Value> loopSeqStack;		std::vector<Value> loopSeqStack;

// Maps AffineDimExpr to the index of the loop in loopStack.		// Maps AffineDimExpr to the index of the loop in loopStack.
// TODO: We should probably use a callback function here to make it more		// TODO: We should probably use a callback function here to make it more
Show All 12 Lines

mlir/lib/Dialect/SparseTensor/Transforms/CodegenUtils.cpp

Show First 20 Lines • Show All 84 Lines • ▼ Show 20 Lines	static Value genIndexAndValueForDense(OpBuilder &builder, Location loc,
Value tensor,		Value tensor,
SmallVectorImpl<Value> &indicesArray,		SmallVectorImpl<Value> &indicesArray,
ValueRange ivs) {		ValueRange ivs) {
Value val = genValueForDense(builder, loc, tensor, ivs);		Value val = genValueForDense(builder, loc, tensor, ivs);
indicesArray.append(ivs.begin(), ivs.end());		indicesArray.append(ivs.begin(), ivs.end());
return val;		return val;
}		}

		/// Returns field index of sparse tensor type for pointers/indices, when set.
		unsigned SparseTensorDescriptor::getFieldIndex(unsigned dim,
		FieldKind kind) const {
		unsigned fieldIdx = -1u;
		foreachFieldInSparseTensor(
		rType,
		[dim, kind, &fieldIdx](Type /*fieldType*/, unsigned fIdx, FieldKind fKind,
		aartbik (Done): This description describes the layout and thus drives all the code. Can you please move this into the header, inside the class documentation.
		unsigned fDim, DimLevelType dlt) -> bool {
		if (fDim == dim && kind == fKind) {
		fieldIdx = fIdx;
		// Returns false to break the iteration.
		return false;
		}
		return true;
		});
		assert(fieldIdx != -1u);
		return fieldIdx;
		}

		void SparseTensorDescriptor::foreachFieldInSparseTensor(
		RankedTensorType rType,
		llvm::function_ref<bool(
		Type /*fieldType*/, unsigned /*fieldIdx*/, FieldKind /*fieldKind*/,
		unsigned /*dim (if applicable)*/, DimLevelType /*DLT (if applicable)*/)>
		callback) {
		//
		// Sparse tensor storage scheme for rank-dimensional tensor is organized
		// as a single compound type with the following fields. Note that every
		// memref with ? size actually behaves as a "vector", i.e. the stored
		// size is the capacity and the used size resides in the memSizes array.
		//
		// struct {
		// memref<rank x index> dimSizes ; size in each dimension
		// memref<n x index> memSizes ; sizes of ptrs/inds/values
		// ; per-dimension d:
		// ; if dense:
		// <nothing>
		// ; if compressed:
		// memref<? x ptr> pointers-d ; pointers for sparse dim d
		// memref<? x idx> indices-d ; indices for sparse dim d
		// ; if singleton:
		// memref<? x idx> indices-d ; indices for singleton dim d
		// memref<? x eltType> values ; values
		// };
		//
		// The dimSizes array and memSizes array.
		auto enc = getSparseTensorEncoding(rType);
		assert(enc);
		// Construct the basic types.
		auto *context = rType.getContext();
		unsigned idxWidth = enc.getIndexBitWidth();
		unsigned ptrWidth = enc.getPointerBitWidth();
		Type indexType = IndexType::get(context);
		Type idxType = idxWidth ? IntegerType::get(context, idxWidth) : indexType;
		Type ptrType = ptrWidth ? IntegerType::get(context, ptrWidth) : indexType;
		Type eltType = rType.getElementType();
		unsigned rank = rType.getShape().size();

		Type dimSizeType = MemRefType::get({rank}, indexType);
		Type memSizeType =
		MemRefType::get({getNumDataFieldsFromEncoding(enc)}, indexType);
		Type ptrMemType = MemRefType::get({ShapedType::kDynamic}, ptrType);
		Type idxMemType = MemRefType::get({ShapedType::kDynamic}, idxType);
		Type valMemType = MemRefType::get({ShapedType::kDynamic}, eltType);

		#define RETURN_ON_FALSE(type, idx, kind, dim, dlt) \
		if (!(callback(type, idx, kind, dim, dlt))) \
		return;

		RETURN_ON_FALSE(dimSizeType, dimSizesIdx, FieldKind::DimSizes, -1u,
		DimLevelType::Undef);
		RETURN_ON_FALSE(memSizeType, memSizesIdx, FieldKind::MemSizes, -1u,
		DimLevelType::Undef);

		static_assert(dataFieldIdx == memSizesIdx + 1);
		unsigned fieldIdx = dataFieldIdx;
		// Per-dimension storage.
		for (unsigned r = 0, rank = enc.getDimLevelType().size(); r < rank; r++) {
		// Dimension level types apply in order to the reordered dimension.
		// As a result, the compound type can be constructed directly in the given
		// order.
		auto dlt = getDimLevelType(enc, r);
		if (isCompressedDLT(dlt)) {
		RETURN_ON_FALSE(ptrMemType, fieldIdx++, FieldKind::PtrMemRef, r, dlt);
		RETURN_ON_FALSE(idxMemType, fieldIdx++, FieldKind::IdxMemRef, r, dlt);
		} else if (isSingletonDLT(dlt)) {
		RETURN_ON_FALSE(idxMemType, fieldIdx++, FieldKind::IdxMemRef, r, dlt);
		} else {
		assert(isDenseDLT(dlt)); // no fields
		}
		}
		// The values array.
		RETURN_ON_FALSE(valMemType, fieldIdx++, FieldKind::ValMemRef, -1u,
		DimLevelType::Undef);

		#undef RETURN_ON_FALSE

		assert(fieldIdx == getNumFieldsFromEncoding(enc));
		}
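
The callback-driven field enumeration above can be illustrated outside of MLIR. The following is a minimal sketch, not the patch's code: `LevelKind`, `FieldKind`, `kNoDim`, `forEachField`, `fieldIndex`, and `numFields` are hypothetical stand-ins that only mirror the field ordering described in the storage-scheme comment (two header fields, then per-dimension pointers/indices, then values) and the early-exit convention where a callback returning false stops the iteration.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-ins for the MLIR types used in the patch.
enum class LevelKind { Dense, Compressed, Singleton };
enum class FieldKind { DimSizes, MemSizes, PtrMemRef, IdxMemRef, ValMemRef };

// Marker for fields that do not belong to a particular dimension.
constexpr unsigned kNoDim = -1u;

// Visits every field in storage order as callback(fieldIdx, kind, dim).
// The iteration stops as soon as the callback returns false.
void forEachField(
    const std::vector<LevelKind> &levels,
    const std::function<bool(unsigned, FieldKind, unsigned)> &callback) {
  unsigned fieldIdx = 0;
  // Header: dimSizes and memSizes always come first.
  if (!callback(fieldIdx++, FieldKind::DimSizes, kNoDim)) return;
  if (!callback(fieldIdx++, FieldKind::MemSizes, kNoDim)) return;
  // Per-dimension storage.
  for (unsigned d = 0; d < levels.size(); ++d) {
    if (levels[d] == LevelKind::Compressed) {
      if (!callback(fieldIdx++, FieldKind::PtrMemRef, d)) return;
      if (!callback(fieldIdx++, FieldKind::IdxMemRef, d)) return;
    } else if (levels[d] == LevelKind::Singleton) {
      if (!callback(fieldIdx++, FieldKind::IdxMemRef, d)) return;
    }
    // Dense levels contribute no fields.
  }
  // The values array is always the last field.
  callback(fieldIdx, FieldKind::ValMemRef, kNoDim);
}

// Looks up a field by (dim, kind); returns kNoDim when no such field exists.
unsigned fieldIndex(const std::vector<LevelKind> &levels, unsigned dim,
                    FieldKind kind) {
  unsigned result = kNoDim;
  forEachField(levels, [&](unsigned idx, FieldKind k, unsigned d) {
    if (d == dim && k == kind) {
      result = idx;
      return false; // Found: break the iteration.
    }
    return true;
  });
  return result;
}

// Counts all fields by visiting them without ever breaking.
unsigned numFields(const std::vector<LevelKind> &levels) {
  unsigned n = 0;
  forEachField(levels, [&](unsigned, FieldKind, unsigned) {
    ++n;
    return true;
  });
  return n;
}
```

Under these assumptions, a CSR-like layout `{Dense, Compressed}` has five fields, with the pointers and indices of dimension 1 at field indices 2 and 3. Type conversion and allocation can then share one traversal instead of each re-deriving the layout.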

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Sparse tensor loop emitter class implementations		// Sparse tensor loop emitter class implementations
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

SparseTensorLoopEmitter::SparseTensorLoopEmitter(ValueRange tensors,		SparseTensorLoopEmitter::SparseTensorLoopEmitter(ValueRange tensors,
StringAttr loopTag,		StringAttr loopTag,
bool hasOutput,		bool hasOutput,
bool isSparseOut,		bool isSparseOut,
▲ Show 20 Lines • Show All 1,069 Lines • Show Last 20 Lines

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorCodegen.cpp

Show All 24 Lines
#include "mlir/Dialect/SparseTensor/Transforms/Passes.h"		#include "mlir/Dialect/SparseTensor/Transforms/Passes.h"
#include "mlir/Dialect/Tensor/IR/Tensor.h"		#include "mlir/Dialect/Tensor/IR/Tensor.h"
#include "mlir/Transforms/DialectConversion.h"		#include "mlir/Transforms/DialectConversion.h"

using namespace mlir;		using namespace mlir;
using namespace mlir::sparse_tensor;		using namespace mlir::sparse_tensor;

namespace {		namespace {

static constexpr uint64_t dimSizesIdx = 0;
static constexpr uint64_t memSizesIdx = 1;
static constexpr uint64_t fieldsIdx = 2;

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Helper methods.		// Helper methods.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// Returns the "tuple" value of the adapted tensor.		/// Returns the "tuple" value of the adapted tensor.
static UnrealizedConversionCastOp getTuple(Value tensor) {		static UnrealizedConversionCastOp getTuple(Value tensor) {
return llvm::cast<UnrealizedConversionCastOp>(tensor.getDefiningOp());		return llvm::cast<UnrealizedConversionCastOp>(tensor.getDefiningOp());
}		}

		static SparseTensorDescriptor getDescriptorFromTensorTuple(Value tensor) {
		auto tuple = getTuple(tensor);
		return SparseTensorDescriptor(tuple.getResultTypes()[0], tuple.getInputs());
		}

/// Packs the given values as a "tuple" value.		/// Packs the given values as a "tuple" value.
static Value genTuple(OpBuilder &builder, Location loc, Type tp,		static Value genTuple(OpBuilder &builder, Location loc, Type tp,
ValueRange values) {		ValueRange values) {
return builder.create<UnrealizedConversionCastOp>(loc, TypeRange(tp), values)		return builder.create<UnrealizedConversionCastOp>(loc, TypeRange(tp), values)
.getResult(0);		.getResult(0);
}		}

/// Flatten a list of operands that may contain sparse tensors.		/// Flatten a list of operands that may contain sparse tensors.
Show All 36 Lines	static void genStore(OpBuilder &builder, Location loc, Value val, Value mem,
idx = toType(builder, loc, idx, builder.getIndexType());		idx = toType(builder, loc, idx, builder.getIndexType());
val = toType(builder, loc, val,		val = toType(builder, loc, val,
mem.getType().cast<ShapedType>().getElementType());		mem.getType().cast<ShapedType>().getElementType());
builder.create<memref::StoreOp>(loc, val, mem, idx);		builder.create<memref::StoreOp>(loc, val, mem, idx);
}		}

/// Creates a straightforward counting for-loop.		/// Creates a straightforward counting for-loop.
static scf::ForOp createFor(OpBuilder &builder, Location loc, Value upper,		static scf::ForOp createFor(OpBuilder &builder, Location loc, Value upper,
SmallVectorImpl<Value> &fields,		MutableArrayRef<Value> fields,
Value lower = Value()) {		Value lower = Value()) {
Type indexType = builder.getIndexType();		Type indexType = builder.getIndexType();
if (!lower)		if (!lower)
lower = constantZero(builder, loc, indexType);		lower = constantZero(builder, loc, indexType);
Value one = constantOne(builder, loc, indexType);		Value one = constantOne(builder, loc, indexType);
scf::ForOp forOp = builder.create<scf::ForOp>(loc, lower, upper, one, fields);		scf::ForOp forOp = builder.create<scf::ForOp>(loc, lower, upper, one, fields);
for (unsigned i = 0, e = fields.size(); i < e; i++)		for (unsigned i = 0, e = fields.size(); i < e; i++)
fields[i] = forOp.getRegionIterArg(i);		fields[i] = forOp.getRegionIterArg(i);
Show All 14 Lines	static Optional<Value> sizeFromTensorAtDim(OpBuilder &builder, Location loc,
// Access into static dimension can query original type directly.		// Access into static dimension can query original type directly.
// Note that this is typically already done by DimOp's folding.		// Note that this is typically already done by DimOp's folding.
auto shape = tensorTp.getShape();		auto shape = tensorTp.getShape();
if (!ShapedType::isDynamic(shape[dim]))		if (!ShapedType::isDynamic(shape[dim]))
return constantIndex(builder, loc, shape[dim]);		return constantIndex(builder, loc, shape[dim]);

// Any other query can consult the dimSizes array at field DimSizesIdx,		// Any other query can consult the dimSizes array at field DimSizesIdx,
// accounting for the reordering applied to the sparse storage.		// accounting for the reordering applied to the sparse storage.
auto tuple = getTuple(adaptedValue);		auto desc = getDescriptorFromTensorTuple(adaptedValue);
Value idx = constantIndex(builder, loc, toStoredDim(tensorTp, dim));		Value idx = constantIndex(builder, loc, toStoredDim(tensorTp, dim));
return builder		return builder.create<memref::LoadOp>(loc, desc.getDimSizesMemRef(), idx)
.create<memref::LoadOp>(loc, tuple.getInputs()[dimSizesIdx], idx)
.getResult();		.getResult();
}		}

// Gets the dimension size at the given stored dimension 'd', either as a		// Gets the dimension size at the given stored dimension 'd', either as a
// constant for a static size, or otherwise dynamically through memSizes.		// constant for a static size, or otherwise dynamically through memSizes.
Value sizeAtStoredDim(OpBuilder &builder, Location loc, RankedTensorType rtp,		Value sizeAtStoredDim(OpBuilder &builder, Location loc, RankedTensorType rtp,
SmallVectorImpl<Value> &fields, unsigned d) {		SmallVectorImpl<Value> &fields, unsigned d) {
unsigned dim = toOrigDim(rtp, d);		unsigned dim = toOrigDim(rtp, d);
auto shape = rtp.getShape();		auto shape = rtp.getShape();
if (!ShapedType::isDynamic(shape[dim]))		if (!ShapedType::isDynamic(shape[dim]))
return constantIndex(builder, loc, shape[dim]);		return constantIndex(builder, loc, shape[dim]);
return genLoad(builder, loc, fields[dimSizesIdx],		return genLoad(builder, loc, fields[SparseTensorDescriptor::dimSizesIdx],
constantIndex(builder, loc, d));		constantIndex(builder, loc, d));
}		}

/// Translates field index to memSizes index.
static unsigned getMemSizesIndex(unsigned field) {
assert(fieldsIdx <= field);
return field - fieldsIdx;
}

/// Creates a pushback op for given field and updates the fields array		/// Creates a pushback op for given field and updates the fields array
/// accordingly. This operation also updates the memSizes contents.		/// accordingly. This operation also updates the memSizes contents.
static void createPushback(OpBuilder &builder, Location loc,		static void createPushback(OpBuilder &builder, Location loc,
SmallVectorImpl<Value> &fields, unsigned field,		SmallVectorImpl<Value> &fields, unsigned field,
Value value, Value repeat = Value()) {		Value value, Value repeat = Value()) {
assert(fieldsIdx <= field && field < fields.size());		assert(field < fields.size());
Type etp = fields[field].getType().cast<ShapedType>().getElementType();		Type etp = fields[field].getType().cast<ShapedType>().getElementType();
fields[field] = builder.create<PushBackOp>(		fields[field] = builder.create<PushBackOp>(
loc, fields[field].getType(), fields[memSizesIdx], fields[field],		loc, fields[field].getType(), fields[SparseTensorDescriptor::memSizesIdx],
toType(builder, loc, value, etp), APInt(64, getMemSizesIndex(field)),		fields[field], toType(builder, loc, value, etp),
repeat);		APInt(64, SparseTensorDescriptor::getFieldMemSizesIndex(field)), repeat);
		Peiming (Author, Done): A better way to create PushBackOp would be `PushBackOp(builder, loc, descriptor, fidx, value)`, which is clearer and can be read as "push a value into the sparse tensor descriptor at the given field index". But it would require us to move `SparseTensorDescriptor` to a publicly available place (and I am not sure whether that is wanted).
}

/// Returns field index of sparse tensor type for pointers/indices, when set.
static unsigned getFieldIndex(Type type, unsigned ptrDim, unsigned idxDim) {
assert(getSparseTensorEncoding(type));
RankedTensorType rType = type.cast<RankedTensorType>();
unsigned field = fieldsIdx; // start past header
for (unsigned r = 0, rank = rType.getShape().size(); r < rank; r++) {
if (isCompressedDim(rType, r)) {
if (r == ptrDim)
return field;
field++;
if (r == idxDim)
return field;
field++;
} else if (isSingletonDim(rType, r)) {
if (r == idxDim)
return field;
field++;
} else {
assert(isDenseDim(rType, r)); // no fields
}
}
assert(ptrDim == -1u && idxDim == -1u);
return field + 1; // return values field index
}		}
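
The relationship between a field index, its slot in memSizes, and the "vector" behavior of a dynamically sized buffer can be sketched in plain C++. This is only an illustration under stated assumptions: `Storage`, `memSizesIndex`, and `pushBack` are hypothetical names, the element type is fixed to `long`, and capacity doubling is an assumed growth policy rather than something this diff shows. The initial capacity of 16 mirrors the `heuristic` constant in createAllocFields.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fields 0 and 1 are the header (dimSizes, memSizes); data fields start at 2.
constexpr std::size_t kDataFieldIdx = 2;

// Translates a field index into its slot in the memSizes array.
std::size_t memSizesIndex(std::size_t field) {
  assert(field >= kDataFieldIdx);
  return field - kDataFieldIdx;
}

// Hypothetical model of the data fields: each buffer's size() plays the role
// of the capacity, while the used length is tracked separately in memSizes.
struct Storage {
  std::vector<std::size_t> memSizes;      // used length per data field
  std::vector<std::vector<long>> buffers; // one capacity-sized buffer per field

  explicit Storage(std::size_t numDataFields, std::size_t capacity = 16)
      : memSizes(numDataFields, 0),
        buffers(numDataFields, std::vector<long>(capacity, 0)) {}
};

// Appends `value` `repeat` times to the given field and updates memSizes.
void pushBack(Storage &s, std::size_t field, long value,
              std::size_t repeat = 1) {
  std::size_t slot = memSizesIndex(field);
  std::size_t used = s.memSizes[slot];
  std::vector<long> &buf = s.buffers[slot];
  // Grow the capacity (assumed policy: doubling) until the new entries fit.
  while (used + repeat > buf.size())
    buf.resize(buf.empty() ? 1 : buf.size() * 2);
  for (std::size_t i = 0; i < repeat; ++i)
    buf[used + i] = value;
  s.memSizes[slot] = used + repeat;
}
```

For example, pushing one zero and then twenty copies of 7 into field 2 leaves `memSizes[0]` at 21 while the underlying buffer has grown from 16 to 32 entries, which is the capacity-versus-size distinction the storage-scheme comment describes.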

/// Maps a sparse tensor type to the appropriate compounded buffers.		/// Maps a sparse tensor type to the appropriate compounded buffers.
static Optional<LogicalResult>		static Optional<LogicalResult>
convertSparseTensorType(Type type, SmallVectorImpl<Type> &fields) {		convertSparseTensorType(Type type, SmallVectorImpl<Type> &fields) {
auto enc = getSparseTensorEncoding(type);		auto enc = getSparseTensorEncoding(type);
if (!enc)		if (!enc)
return llvm::None;		return llvm::None;
// Construct the basic types.
auto *context = type.getContext();
unsigned idxWidth = enc.getIndexBitWidth();
unsigned ptrWidth = enc.getPointerBitWidth();
RankedTensorType rType = type.cast<RankedTensorType>();		RankedTensorType rType = type.cast<RankedTensorType>();
Type indexType = IndexType::get(context);		SparseTensorDescriptor::foreachFieldInSparseTensor(
Type idxType = idxWidth ? IntegerType::get(context, idxWidth) : indexType;		rType,
Type ptrType = ptrWidth ? IntegerType::get(context, ptrWidth) : indexType;		[&fields](Type fieldType, unsigned fieldIdx,
Type eltType = rType.getElementType();		SparseTensorDescriptor::FieldKind /*fieldKind*/,
//		unsigned /*dim*/, DimLevelType /*dlt*/) -> bool {
// Sparse tensor storage scheme for rank-dimensional tensor is organized		assert(fieldIdx == fields.size());
// as a single compound type with the following fields. Note that every		fields.push_back(fieldType);
// memref with ? size actually behaves as a "vector", i.e. the stored		return true;
// size is the capacity and the used size resides in the memSizes array.		});
//
// struct {
// memref<rank x index> dimSizes ; size in each dimension
// memref<n x index> memSizes ; sizes of ptrs/inds/values
// ; per-dimension d:
// ; if dense:
// <nothing>
// ; if compressed:
// memref<? x ptr> pointers-d ; pointers for sparse dim d
// memref<? x idx> indices-d ; indices for sparse dim d
// ; if singleton:
// memref<? x idx> indices-d ; indices for singleton dim d
// memref<? x eltType> values ; values
// };
//
unsigned rank = rType.getShape().size();
unsigned lastField = getFieldIndex(type, -1u, -1u);
// The dimSizes array and memSizes array.
fields.push_back(MemRefType::get({rank}, indexType));
fields.push_back(MemRefType::get({getMemSizesIndex(lastField)}, indexType));
// Per-dimension storage.
for (unsigned r = 0; r < rank; r++) {
// Dimension level types apply in order to the reordered dimension.
// As a result, the compound type can be constructed directly in the given
// order. Clients of this type know what field is what from the sparse
// tensor type.
if (isCompressedDim(rType, r)) {
fields.push_back(MemRefType::get({ShapedType::kDynamic}, ptrType));
fields.push_back(MemRefType::get({ShapedType::kDynamic}, idxType));
} else if (isSingletonDim(rType, r)) {
fields.push_back(MemRefType::get({ShapedType::kDynamic}, idxType));
} else {
assert(isDenseDim(rType, r)); // no fields
}
}
// The values array.
fields.push_back(MemRefType::get({ShapedType::kDynamic}, eltType));
assert(fields.size() == lastField);
return success();		return success();
}		}

/// Generates code that allocates a sparse storage scheme for given rank.		/// Generates code that allocates a sparse storage scheme for given rank.
static void allocSchemeForRank(OpBuilder &builder, Location loc,		static void allocSchemeForRank(OpBuilder &builder, Location loc,
RankedTensorType rtp,		RankedTensorType rtp,
SmallVectorImpl<Value> &fields, unsigned field,		SmallVectorImpl<Value> &fields, unsigned field,
unsigned r0) {		unsigned r0) {
Show All 9 Lines	if (isCompressedDim(rtp, r)) {
Type ptrType = ptrWidth ? builder.getIntegerType(ptrWidth) : indexType;		Type ptrType = ptrWidth ? builder.getIntegerType(ptrWidth) : indexType;
Value ptrZero = constantZero(builder, loc, ptrType);		Value ptrZero = constantZero(builder, loc, ptrType);
createPushback(builder, loc, fields, field, ptrZero, linear);		createPushback(builder, loc, fields, field, ptrZero, linear);
return;		return;
}		}
if (isSingletonDim(rtp, r)) {		if (isSingletonDim(rtp, r)) {
return; // nothing to do		return; // nothing to do
} // Keep compounding the size, but nothing needs to be initialized		} // Keep compounding the size, but nothing needs to be initialized
// at this level. We will eventually reach a compressed level or		// at this level. We will eventually reach a compressed level or
// otherwise the values array for the from-here "all-dense" case.		// otherwise the values array for the from-here "all-dense" case.
assert(isDenseDim(rtp, r));		assert(isDenseDim(rtp, r));
Value size = sizeAtStoredDim(builder, loc, rtp, fields, r);		Value size = sizeAtStoredDim(builder, loc, rtp, fields, r);
linear = builder.create<arith::MulIOp>(loc, linear, size);		linear = builder.create<arith::MulIOp>(loc, linear, size);
}		}
// Reached values array so prepare for an insertion.		// Reached values array so prepare for an insertion.
Value valZero = constantZero(builder, loc, rtp.getElementType());		Value valZero = constantZero(builder, loc, rtp.getElementType());
createPushback(builder, loc, fields, field, valZero, linear);		createPushback(builder, loc, fields, field, valZero, linear);
assert(fields.size() == ++field);		assert(fields.size() == field + 1);
}		}
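
The size compounding performed by allocSchemeForRank can be sketched without MLIR. This is a hedged illustration: part of the function is elided in this view, so the sketch assumes the loop starts with a linear size of 1 and walks the stored dimensions from `start`. `LevelKind`, `Target`, `InitialPush`, and `initialPush` are hypothetical names, and the function only reports what would be pushed rather than emitting any IR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the dimension level types.
enum class LevelKind { Dense, Compressed, Singleton };

// Where the initial zero entries end up.
enum class Target { None, Pointers, Values };

struct InitialPush {
  Target target;     // which buffer receives zeros, if any
  std::size_t level; // level of the pointers buffer (when target is Pointers)
  std::size_t count; // number of zero entries pushed
};

// Walks the levels from `start`, multiplying the sizes of dense levels, until
// a compressed or singleton level (or the values array) is reached.
InitialPush initialPush(const std::vector<LevelKind> &levels,
                        const std::vector<std::size_t> &storedSizes,
                        std::size_t start) {
  std::size_t linear = 1;
  for (std::size_t r = start; r < levels.size(); ++r) {
    if (levels[r] == LevelKind::Compressed)
      return {Target::Pointers, r, linear}; // one zero per parent position
    if (levels[r] == LevelKind::Singleton)
      return {Target::None, r, 0}; // nothing to initialize
    linear *= storedSizes[r]; // dense: keep compounding the size
  }
  // All remaining levels were dense: prepare the values array itself.
  return {Target::Values, levels.size(), linear};
}
```

For a 3x4 tensor with levels `{Dense, Compressed}`, this yields three zeros for the pointers of level 1. Together with the leading zero pushed in createAllocFields, the pointers array then has 3 + 1 entries, which is the "linear + 1" length property mentioned there. For an all-dense 3x4 tensor, the values array receives 12 zeros.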

/// Creates allocation operation.		/// Creates allocation operation.
static Value createAllocation(OpBuilder &builder, Location loc, Type type,		static Value createAllocation(OpBuilder &builder, Location loc,
Value sz, bool enableInit) {		MemRefType memRefType, Value sz,
auto memType = MemRefType::get({ShapedType::kDynamic}, type);		bool enableInit) {
Value buffer = builder.create<memref::AllocOp>(loc, memType, sz);		Value buffer = builder.create<memref::AllocOp>(loc, memRefType, sz);
		Type elemType = memRefType.getElementType();
if (enableInit) {		if (enableInit) {
Value fillValue =		Value fillValue = builder.create<arith::ConstantOp>(
builder.create<arith::ConstantOp>(loc, type, builder.getZeroAttr(type));		loc, elemType, builder.getZeroAttr(elemType));
builder.create<linalg::FillOp>(loc, fillValue, buffer);		builder.create<linalg::FillOp>(loc, fillValue, buffer);
}		}
return buffer;		return buffer;
}		}

/// Creates allocation for each field in sparse tensor type. Note that		/// Creates allocation for each field in sparse tensor type. Note that
/// for all dynamic memrefs, the memory size is really the capacity of		/// for all dynamic memrefs, the memory size is really the capacity of
/// the "vector", while the actual size resides in the sizes array.		/// the "vector", while the actual size resides in the sizes array.
///		///
/// TODO: for efficiency, we will need heuristics to make educated guesses		/// TODO: for efficiency, we will need heuristics to make educated guesses
/// on the required capacities (see heuristic variable).		/// on the required capacities (see heuristic variable).
///		///
static void createAllocFields(OpBuilder &builder, Location loc, Type type,		static void createAllocFields(OpBuilder &builder, Location loc, Type type,
ValueRange dynSizes, bool enableInit,		ValueRange dynSizes, bool enableInit,
SmallVectorImpl<Value> &fields) {		SmallVectorImpl<Value> &fields) {
auto enc = getSparseTensorEncoding(type);
assert(enc);
// Construct the basic types.
unsigned idxWidth = enc.getIndexBitWidth();
unsigned ptrWidth = enc.getPointerBitWidth();
RankedTensorType rtp = type.cast<RankedTensorType>();		RankedTensorType rtp = type.cast<RankedTensorType>();
Type indexType = builder.getIndexType();
Type idxType = idxWidth ? builder.getIntegerType(idxWidth) : indexType;
Type ptrType = ptrWidth ? builder.getIntegerType(ptrWidth) : indexType;
Type eltType = rtp.getElementType();
auto shape = rtp.getShape();
unsigned rank = shape.size();
Value heuristic = constantIndex(builder, loc, 16);		Value heuristic = constantIndex(builder, loc, 16);

		SparseTensorDescriptor::foreachFieldInSparseTensor(
		aartbik (Not Done): note that this works for now, but we actually planned to implement a heuristic here, which will need to scan the ranks + level types again
		Peiming (Author, Done): Then, you can have another foreach before this to compute the heuristic.
		rtp,
		[&](Type fType, unsigned fIdx, SparseTensorDescriptor::FieldKind fKind,
		unsigned /*dim*/, DimLevelType /*dlt*/) -> bool {
		assert(fields.size() == fIdx);
		auto memRefTp = fType.cast<MemRefType>();
		Value field;
		switch (fKind) {
		case SparseTensorDescriptor::FieldKind::DimSizes:
		case SparseTensorDescriptor::FieldKind::MemSizes:
		field = builder.create<memref::AllocOp>(loc, memRefTp);
		break;
		case SparseTensorDescriptor::FieldKind::PtrMemRef:
		case SparseTensorDescriptor::FieldKind::IdxMemRef:
		case SparseTensorDescriptor::FieldKind::ValMemRef:
		field =
		createAllocation(builder, loc, memRefTp, heuristic, enableInit);
		break;
		}
		assert(field);
		fields.push_back(field);
		// Returns true to ontinue the iteration.
		return true;
		aartbik (Done): continue
		});

		SparseTensorDescriptor desc(rtp, fields);

// Build original sizes.		// Build original sizes.
SmallVector<Value> sizes;		SmallVector<Value> sizes;
		auto shape = rtp.getShape();
		unsigned rank = shape.size();
for (unsigned r = 0, o = 0; r < rank; r++) {		for (unsigned r = 0, o = 0; r < rank; r++) {
if (ShapedType::isDynamic(shape[r]))		if (ShapedType::isDynamic(shape[r]))
sizes.push_back(dynSizes[o++]);		sizes.push_back(dynSizes[o++]);
else		else
sizes.push_back(constantIndex(builder, loc, shape[r]));		sizes.push_back(constantIndex(builder, loc, shape[r]));
}		}
// The dimSizes array and memSizes array.
unsigned lastField = getFieldIndex(type, -1u, -1u);
Value dimSizes =
builder.create<memref::AllocOp>(loc, MemRefType::get({rank}, indexType));
Value memSizes = builder.create<memref::AllocOp>(
loc, MemRefType::get({getMemSizesIndex(lastField)}, indexType));
fields.push_back(dimSizes);
fields.push_back(memSizes);
// Per-dimension storage.
for (unsigned r = 0; r < rank; r++) {
if (isCompressedDim(rtp, r)) {
fields.push_back(
createAllocation(builder, loc, ptrType, heuristic, enableInit));
fields.push_back(
createAllocation(builder, loc, idxType, heuristic, enableInit));
} else if (isSingletonDim(rtp, r)) {
fields.push_back(
createAllocation(builder, loc, idxType, heuristic, enableInit));
} else {
assert(isDenseDim(rtp, r)); // no fields
}
}
// The values array.
fields.push_back(
createAllocation(builder, loc, eltType, heuristic, enableInit));
assert(fields.size() == lastField);
// Initialize the storage scheme to an empty tensor. Initializes memSizes		// Initialize the storage scheme to an empty tensor. Initializes memSizes
// to all zeros, sets the dimSizes to known values and gives all pointer		// to all zeros, sets the dimSizes to known values and gives all pointer
// fields an initial zero entry, so that it is easier to maintain the		// fields an initial zero entry, so that it is easier to maintain the
// "linear + 1" length property.		// "linear + 1" length property.
builder.create<linalg::FillOp>(		builder.create<linalg::FillOp>(
loc, ValueRange{constantZero(builder, loc, indexType)},		loc, constantZero(builder, loc, builder.getIndexType()),
ValueRange{memSizes}); // zero memSizes		desc.getMemSizesMemRef()); // zero memSizes
Value ptrZero = constantZero(builder, loc, ptrType);
for (unsigned r = 0, field = fieldsIdx; r < rank; r++) {		Value ptrZero = constantZero(builder, loc, desc.getPtrElementType());
		for (unsigned r = 0; r < rank; r++) {
unsigned ro = toOrigDim(rtp, r);		unsigned ro = toOrigDim(rtp, r);
genStore(builder, loc, sizes[ro], dimSizes, constantIndex(builder, loc, r));		// Fills dim sizes array.
		genStore(builder, loc, sizes[ro], desc.getDimSizesMemRef(),
		constantIndex(builder, loc, r));

		// Pushes a leading zero to pointers memref.
if (isCompressedDim(rtp, r)) {		if (isCompressedDim(rtp, r)) {
createPushback(builder, loc, fields, field, ptrZero);		unsigned fid = desc.getPtrMemRefIndex(r);
field += 2;		createPushback(builder, loc, fields, fid, ptrZero);
} else if (isSingletonDim(rtp, r)) {
field += 1;
}		}
}		}
allocSchemeForRank(builder, loc, rtp, fields, fieldsIdx, /*rank=*/0);		allocSchemeForRank(builder, loc, rtp, fields,
		SparseTensorDescriptor::dataFieldIdx, /*rank=*/0);
}		}

/// Helper method that generates block specific to compressed case:		/// Helper method that generates block specific to compressed case:
///		///
/// plo = pointers[d][pos[d-1]]		/// plo = pointers[d][pos[d-1]]
/// phi = pointers[d][pos[d-1]+1]		/// phi = pointers[d][pos[d-1]+1]
/// msz = indices[d].size()		/// msz = indices[d].size()
/// if (plo < phi) {		/// if (plo < phi) {
Show All 18 Lines	static Value genCompressed(OpBuilder &builder, Location loc,
unsigned rank = rtp.getShape().size();		unsigned rank = rtp.getShape().size();
SmallVector<Type> types;		SmallVector<Type> types;
Type indexType = builder.getIndexType();		Type indexType = builder.getIndexType();
Type boolType = builder.getIntegerType(1);		Type boolType = builder.getIntegerType(1);
Value one = constantIndex(builder, loc, 1);		Value one = constantIndex(builder, loc, 1);
Value pp1 = builder.create<arith::AddIOp>(loc, pos, one);		Value pp1 = builder.create<arith::AddIOp>(loc, pos, one);
Value plo = genLoad(builder, loc, fields[field], pos);		Value plo = genLoad(builder, loc, fields[field], pos);
Value phi = genLoad(builder, loc, fields[field], pp1);		Value phi = genLoad(builder, loc, fields[field], pp1);
Value psz = constantIndex(builder, loc, getMemSizesIndex(field + 1));		Value psz = constantIndex(
Value msz = genLoad(builder, loc, fields[memSizesIdx], psz);		builder, loc, SparseTensorDescriptor::getFieldMemSizesIndex(field + 1));
		Value msz =
		genLoad(builder, loc, fields[SparseTensorDescriptor::memSizesIdx], psz);
Value phim1 = builder.create<arith::SubIOp>(		Value phim1 = builder.create<arith::SubIOp>(
loc, toType(builder, loc, phi, indexType), one);		loc, toType(builder, loc, phi, indexType), one);
// Conditional expression.		// Conditional expression.
Value lt =		Value lt =
builder.create<arith::CmpIOp>(loc, arith::CmpIPredicate::ult, plo, phi);		builder.create<arith::CmpIOp>(loc, arith::CmpIPredicate::ult, plo, phi);
types.push_back(boolType);		types.push_back(boolType);
scf::IfOp ifOp1 = builder.create<scf::IfOp>(loc, types, lt, /*else*/ true);		scf::IfOp ifOp1 = builder.create<scf::IfOp>(loc, types, lt, /*else*/ true);
types.pop_back();		types.pop_back();
builder.setInsertionPointToStart(&ifOp1.getThenRegion().front());		builder.setInsertionPointToStart(&ifOp1.getThenRegion().front());
Value crd = genLoad(builder, loc, fields[field + 1], phim1);		Value crd = genLoad(builder, loc, fields[field + 1], phim1);
Value eq = builder.create<arith::CmpIOp>(loc, arith::CmpIPredicate::eq,		Value eq = builder.create<arith::CmpIOp>(loc, arith::CmpIPredicate::eq,
toType(builder, loc, crd, indexType),		toType(builder, loc, crd, indexType),
indices[d]);		indices[d]);
builder.create<scf::YieldOp>(loc, eq);		builder.create<scf::YieldOp>(loc, eq);
builder.setInsertionPointToStart(&ifOp1.getElseRegion().front());		builder.setInsertionPointToStart(&ifOp1.getElseRegion().front());
if (d > 0)		if (d > 0)
genStore(builder, loc, msz, fields[field], pos);		genStore(builder, loc, msz, fields[field], pos);
builder.create<scf::YieldOp>(loc, constantI1(builder, loc, false));		builder.create<scf::YieldOp>(loc, constantI1(builder, loc, false));
builder.setInsertionPointAfter(ifOp1);		builder.setInsertionPointAfter(ifOp1);
Value p = ifOp1.getResult(0);		Value p = ifOp1.getResult(0);
// If present construct. Note that for a non-unique dimension level, we simply		// If present construct. Note that for a non-unique dimension level, we
// set the condition to false and rely on CSE/DCE to clean up the IR.		// simply set the condition to false and rely on CSE/DCE to clean up the IR.
//		//
// TODO: generate less temporary IR?		// TODO: generate less temporary IR?
//		//
for (unsigned i = 0, e = fields.size(); i < e; i++)		for (unsigned i = 0, e = fields.size(); i < e; i++)
types.push_back(fields[i].getType());		types.push_back(fields[i].getType());
types.push_back(indexType);		types.push_back(indexType);
if (!isUniqueDim(rtp, d))		if (!isUniqueDim(rtp, d))
p = constantI1(builder, loc, false);		p = constantI1(builder, loc, false);
scf::IfOp ifOp2 = builder.create<scf::IfOp>(loc, types, p, /*else*/ true);		scf::IfOp ifOp2 = builder.create<scf::IfOp>(loc, types, p, /*else*/ true);
// If present (fields unaffected, update next to phim1).		// If present (fields unaffected, update next to phim1).
builder.setInsertionPointToStart(&ifOp2.getThenRegion().front());		builder.setInsertionPointToStart(&ifOp2.getThenRegion().front());
fields.push_back(phim1);		fields.push_back(phim1);
builder.create<scf::YieldOp>(loc, fields);		builder.create<scf::YieldOp>(loc, fields);
fields.pop_back();		fields.pop_back();
// If !present (changes fields, update next).		// If !present (changes fields, update next).
builder.setInsertionPointToStart(&ifOp2.getElseRegion().front());		builder.setInsertionPointToStart(&ifOp2.getElseRegion().front());
Value mszp1 = builder.create<arith::AddIOp>(loc, msz, one);		Value mszp1 = builder.create<arith::AddIOp>(loc, msz, one);
		aartbik: yeah, agreed, this made more sense in the original, but the alternative is to copy and create a fully new array.
genStore(builder, loc, mszp1, fields[field], pp1);		genStore(builder, loc, mszp1, fields[field], pp1);
createPushback(builder, loc, fields, field + 1, indices[d]);		createPushback(builder, loc, fields, field + 1, indices[d]);
// Prepare the next dimension "as needed".		// Prepare the next dimension "as needed".
if ((d + 1) < rank)		if ((d + 1) < rank)
allocSchemeForRank(builder, loc, rtp, fields, field + 2, d + 1);		allocSchemeForRank(builder, loc, rtp, fields, field + 2, d + 1);
fields.push_back(msz);		fields.push_back(msz);
builder.create<scf::YieldOp>(loc, fields);		builder.create<scf::YieldOp>(loc, fields);
fields.pop_back();		fields.pop_back();
[16 lines not shown]
///		///
/// TODO: better unord/not-unique; also generalize, optimize, specialize!		/// TODO: better unord/not-unique; also generalize, optimize, specialize!
///		///
static void genInsert(OpBuilder &builder, Location loc, RankedTensorType rtp,		static void genInsert(OpBuilder &builder, Location loc, RankedTensorType rtp,
SmallVectorImpl<Value> &fields,		SmallVectorImpl<Value> &fields,
SmallVectorImpl<Value> &indices, Value value) {		SmallVectorImpl<Value> &indices, Value value) {
unsigned rank = rtp.getShape().size();		unsigned rank = rtp.getShape().size();
assert(rank == indices.size());		assert(rank == indices.size());
unsigned field = fieldsIdx; // start past header		SparseTensorDescriptor desc(rtp, fields);
Value pos = constantZero(builder, loc, builder.getIndexType());		Value pos = constantZero(builder, loc, builder.getIndexType());
// Generate code for every dimension.		// Generate code for every dimension.
for (unsigned d = 0; d < rank; d++) {		for (unsigned d = 0; d < rank; d++) {
if (isCompressedDim(rtp, d)) {		if (isCompressedDim(rtp, d)) {
// Create:		// Create:
// if (!present) {		// if (!present) {
// indices[d].push_back(i[d])		// indices[d].push_back(i[d])
// <update pointers and prepare dimension d + 1>		// <update pointers and prepare dimension d + 1>
// }		// }
// pos[d] = indices.size() - 1		// pos[d] = indices.size() - 1
// <insert @ pos[d] at next dimension d + 1>		// <insert @ pos[d] at next dimension d + 1>
pos = genCompressed(builder, loc, rtp, fields, indices, value, pos, field,		pos = genCompressed(builder, loc, rtp, fields, indices, value, pos,
d);		desc.getPtrMemRefIndex(d), d);
field += 2;
} else if (isSingletonDim(rtp, d)) {		} else if (isSingletonDim(rtp, d)) {
// Create:		// Create:
// indices[d].push_back(i[d])		// indices[d].push_back(i[d])
// pos[d] = pos[d-1]		// pos[d] = pos[d-1]
// <insert @ pos[d] at next dimension d + 1>		// <insert @ pos[d] at next dimension d + 1>
createPushback(builder, loc, fields, field, indices[d]);		unsigned fidx = desc.getIdxMemRefIndex(d);
field += 1;		createPushback(builder, loc, fields, fidx, indices[d]);
} else {		} else {
assert(isDenseDim(rtp, d));		assert(isDenseDim(rtp, d));
// Construct the new position as:		// Construct the new position as:
// pos[d] = size * pos[d-1] + i[d]		// pos[d] = size * pos[d-1] + i[d]
// <insert @ pos[d] at next dimension d + 1>		// <insert @ pos[d] at next dimension d + 1>
Value size = sizeAtStoredDim(builder, loc, rtp, fields, d);		Value size = sizeAtStoredDim(builder, loc, rtp, fields, d);
Value mult = builder.create<arith::MulIOp>(loc, size, pos);		Value mult = builder.create<arith::MulIOp>(loc, size, pos);
pos = builder.create<arith::AddIOp>(loc, mult, indices[d]);		pos = builder.create<arith::AddIOp>(loc, mult, indices[d]);
}		}
}		}
// Reached the actual value append/insert.		// Reached the actual value append/insert.
if (!isDenseDim(rtp, rank - 1))		unsigned valIdx = desc.getValMemRefIndex();
createPushback(builder, loc, fields, field++, value);		if (!isDenseDim(rtp, rank - 1)) {
else		createPushback(builder, loc, fields, valIdx, value);
genStore(builder, loc, value, fields[field++], pos);		} else
assert(fields.size() == field);		genStore(builder, loc, value, desc.getValMemRef(), pos);
		// assert(fields.size() == field);
}		}
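
The dense branch of genInsert above folds each index into a running position with pos[d] = size * pos[d-1] + i[d]. A minimal standalone sketch of that recurrence on plain integers follows; `linearizeDense` is a hypothetical helper for illustration only and is not part of the patch, which emits arith ops instead of computing values directly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not in the patch): applies
//   pos[d] = size[d] * pos[d-1] + i[d]
// across all dimensions, starting from pos = 0, assuming every level is dense.
uint64_t linearizeDense(const std::vector<uint64_t> &sizes,
                        const std::vector<uint64_t> &indices) {
  assert(sizes.size() == indices.size());
  uint64_t pos = 0;
  for (std::size_t d = 0; d < sizes.size(); ++d)
    pos = sizes[d] * pos + indices[d];
  return pos;
}
```

For an all-dense tensor this is ordinary row-major linearization; in genInsert the same update is only applied at dense levels, while compressed levels obtain `pos` from genCompressed.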

/// Generates insertion finalization code.		/// Generates insertion finalization code.
static void genEndInsert(OpBuilder &builder, Location loc, RankedTensorType rtp,		static void genEndInsert(OpBuilder &builder, Location loc, RankedTensorType rtp,
SmallVectorImpl<Value> &fields) {		SmallVectorImpl<Value> &fields) {
unsigned rank = rtp.getShape().size();		unsigned rank = rtp.getShape().size();
unsigned field = fieldsIdx; // start past header		unsigned field = SparseTensorDescriptor::dataFieldIdx; // start past header
for (unsigned d = 0; d < rank; d++) {		for (unsigned d = 0; d < rank; d++) {
if (isCompressedDim(rtp, d)) {		if (isCompressedDim(rtp, d)) {
// Compressed dimensions need a pointer cleanup for all entries		// Compressed dimensions need a pointer cleanup for all entries
// that were not visited during the insertion pass.		// that were not visited during the insertion pass.
//		//
// TODO: avoid cleanup and keep compressed scheme consistent at all times?		// TODO: avoid cleanup and keep compressed scheme consistent at all
		// times?
//		//
if (d > 0) {		if (d > 0) {
unsigned ptrWidth = getSparseTensorEncoding(rtp).getPointerBitWidth();		unsigned ptrWidth = getSparseTensorEncoding(rtp).getPointerBitWidth();
Type indexType = builder.getIndexType();		Type indexType = builder.getIndexType();
Type ptrType = ptrWidth ? builder.getIntegerType(ptrWidth) : indexType;		Type ptrType = ptrWidth ? builder.getIntegerType(ptrWidth) : indexType;
Value mz = constantIndex(builder, loc, getMemSizesIndex(field));		Value mz = constantIndex(
Value hi = genLoad(builder, loc, fields[memSizesIdx], mz);		builder, loc, SparseTensorDescriptor::getFieldMemSizesIndex(field));
		Value hi = genLoad(builder, loc,
		fields[SparseTensorDescriptor::memSizesIdx], mz);
Value zero = constantIndex(builder, loc, 0);		Value zero = constantIndex(builder, loc, 0);
Value one = constantIndex(builder, loc, 1);		Value one = constantIndex(builder, loc, 1);
// Vector of only one, but needed by createFor's prototype.		// Vector of only one, but needed by createFor's prototype.
SmallVector<Value, 1> inits{genLoad(builder, loc, fields[field], zero)};		SmallVector<Value, 1> inits{genLoad(builder, loc, fields[field], zero)};
scf::ForOp loop = createFor(builder, loc, hi, inits, one);		scf::ForOp loop = createFor(builder, loc, hi, inits, one);
Value i = loop.getInductionVar();		Value i = loop.getInductionVar();
Value oldv = loop.getRegionIterArg(0);		Value oldv = loop.getRegionIterArg(0);
Value newv = genLoad(builder, loc, fields[field], i);		Value newv = genLoad(builder, loc, fields[field], i);
[363 lines not shown]
public:
using OpAdaptor = typename SourceOp::Adaptor;		using OpAdaptor = typename SourceOp::Adaptor;
using OpConversionPattern<SourceOp>::OpConversionPattern;		using OpConversionPattern<SourceOp>::OpConversionPattern;
LogicalResult		LogicalResult
matchAndRewrite(SourceOp op, OpAdaptor adaptor,		matchAndRewrite(SourceOp op, OpAdaptor adaptor,
ConversionPatternRewriter &rewriter) const override {		ConversionPatternRewriter &rewriter) const override {
// Replace the requested pointer access with corresponding field.		// Replace the requested pointer access with corresponding field.
// The cast_op is inserted by type converter to intermix 1:N type		// The cast_op is inserted by type converter to intermix 1:N type
// conversion.		// conversion.
auto tuple = getTuple(adaptor.getTensor());		auto desc = getDescriptorFromTensorTuple(adaptor.getTensor());
unsigned idx = Base::getIndexForOp(tuple, op);		Value field = Base::getFieldForOp(desc, op);
auto fields = tuple.getInputs();		rewriter.replaceOp(op, field);
assert(idx < fields.size());
rewriter.replaceOp(op, fields[idx]);
return success();		return success();
}		}
};		};

/// Sparse codegen rule for pointer accesses.		/// Sparse codegen rule for pointer accesses.
class SparseToPointersConverter		class SparseToPointersConverter
: public SparseGetterOpConverter<ToPointersOp, SparseToPointersConverter> {		: public SparseGetterOpConverter<ToPointersOp, SparseToPointersConverter> {
public:		public:
using SparseGetterOpConverter::SparseGetterOpConverter;		using SparseGetterOpConverter::SparseGetterOpConverter;
// Callback for SparseGetterOpConverter.		// Callback for SparseGetterOpConverter.
static unsigned getIndexForOp(UnrealizedConversionCastOp /*tuple*/,		static Value getFieldForOp(const SparseTensorDescriptor &desc,
ToPointersOp op) {		ToPointersOp op) {
uint64_t dim = op.getDimension().getZExtValue();		uint64_t dim = op.getDimension().getZExtValue();
return getFieldIndex(op.getTensor().getType(), /*ptrDim=*/dim, -1u);		return desc.getPtrMemRef(dim);
}		}
};		};

/// Sparse codegen rule for index accesses.		/// Sparse codegen rule for index accesses.
class SparseToIndicesConverter		class SparseToIndicesConverter
: public SparseGetterOpConverter<ToIndicesOp, SparseToIndicesConverter> {		: public SparseGetterOpConverter<ToIndicesOp, SparseToIndicesConverter> {
public:		public:
using SparseGetterOpConverter::SparseGetterOpConverter;		using SparseGetterOpConverter::SparseGetterOpConverter;
// Callback for SparseGetterOpConverter.		// Callback for SparseGetterOpConverter.
static unsigned getIndexForOp(UnrealizedConversionCastOp /*tuple*/,		static Value getFieldForOp(const SparseTensorDescriptor &desc,
ToIndicesOp op) {		ToIndicesOp op) {
uint64_t dim = op.getDimension().getZExtValue();		uint64_t dim = op.getDimension().getZExtValue();
return getFieldIndex(op.getTensor().getType(), -1u, /*idxDim=*/dim);		return desc.getIdxMemRef(dim);
}		}
};		};

/// Sparse codegen rule for value accesses.		/// Sparse codegen rule for value accesses.
class SparseToValuesConverter		class SparseToValuesConverter
: public SparseGetterOpConverter<ToValuesOp, SparseToValuesConverter> {		: public SparseGetterOpConverter<ToValuesOp, SparseToValuesConverter> {
public:		public:
using SparseGetterOpConverter::SparseGetterOpConverter;		using SparseGetterOpConverter::SparseGetterOpConverter;
// Callback for SparseGetterOpConverter.		// Callback for SparseGetterOpConverter.
static unsigned getIndexForOp(UnrealizedConversionCastOp tuple,		static Value getFieldForOp(const SparseTensorDescriptor &desc,
ToValuesOp /*op*/) {		ToValuesOp /*op*/) {
// The last field holds the value buffer.		return desc.getValMemRef();
return tuple.getInputs().size() - 1;
}		}
};		};
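
The getter converters above stop computing field offsets themselves and ask the descriptor instead. A minimal sketch of that idea, assuming the layout the old genInsert code walks through (a header, then two fields per compressed level, one per singleton level, none per dense level, and the values buffer last). `ToyDescriptor`, `LevelKind`, `kNoField`, and the `headerFields` parameter are hypothetical names for illustration; this is not the SparseTensorDescriptor class from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; not the MLIR types.
enum class LevelKind { Dense, Compressed, Singleton };
constexpr unsigned kNoField = ~0u; // marks "this level has no such field"

class ToyDescriptor {
public:
  // headerFields is an assumption: the number of fields that precede the
  // per-level data (the "start past header" offset in the old code).
  ToyDescriptor(const std::vector<LevelKind> &levels, unsigned headerFields)
      : ptrIdx(levels.size(), kNoField), idxIdx(levels.size(), kNoField) {
    unsigned field = headerFields;
    for (std::size_t d = 0; d < levels.size(); ++d) {
      if (levels[d] == LevelKind::Compressed) {
        ptrIdx[d] = field++; // pointers buffer
        idxIdx[d] = field++; // indices buffer
      } else if (levels[d] == LevelKind::Singleton) {
        idxIdx[d] = field++; // indices buffer only
      }
      // Dense levels contribute no fields.
    }
    valIdx = field; // the values buffer comes last
  }

  unsigned getPtrIndex(std::size_t d) const { return ptrIdx[d]; }
  unsigned getIdxIndex(std::size_t d) const { return idxIdx[d]; }
  unsigned getValIndex() const { return valIdx; }

private:
  std::vector<unsigned> ptrIdx, idxIdx;
  unsigned valIdx;
};
```

Computing the indices once in the constructor replaces the running `field += 2` / `field += 1` bookkeeping that each caller previously had to keep in sync by hand.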

/// Sparse codegen rule for the convert operator.		/// Sparse codegen rule for the convert operator.
class SparseConvertConverter : public OpConversionPattern<ConvertOp> {		class SparseConvertConverter : public OpConversionPattern<ConvertOp> {
public:		public:
using OpConversionPattern::OpConversionPattern;		using OpConversionPattern::OpConversionPattern;
LogicalResult		LogicalResult
[15 lines not shown]
class SparseNumberOfEntriesConverter		class SparseNumberOfEntriesConverter
: public OpConversionPattern<NumberOfEntriesOp> {		: public OpConversionPattern<NumberOfEntriesOp> {
public:		public:
using OpConversionPattern::OpConversionPattern;		using OpConversionPattern::OpConversionPattern;
LogicalResult		LogicalResult
matchAndRewrite(NumberOfEntriesOp op, OpAdaptor adaptor,		matchAndRewrite(NumberOfEntriesOp op, OpAdaptor adaptor,
ConversionPatternRewriter &rewriter) const override {		ConversionPatternRewriter &rewriter) const override {
// Query memSizes for the actually stored values size.		// Query memSizes for the actually stored values size.
auto tuple = getTuple(adaptor.getTensor());		auto desc = getDescriptorFromTensorTuple(adaptor.getTensor());
auto fields = tuple.getInputs();
unsigned lastField = fields.size() - 1;
Value field =		Value field =
constantIndex(rewriter, op.getLoc(), getMemSizesIndex(lastField));		constantIndex(rewriter, op.getLoc(), desc.getValMemSizesIndex());
rewriter.replaceOpWithNewOp<memref::LoadOp>(op, fields[memSizesIdx], field);		rewriter.replaceOpWithNewOp<memref::LoadOp>(op, desc.getMemSizesMemRef(),
		field);
return success();		return success();
}		}
};		};

} // namespace		} // namespace

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Sparse tensor type conversion into an actual buffer.		// Sparse tensor type conversion into an actual buffer.
[38 lines not shown]