This is an archive of the discontinued LLVM Phabricator instance.

mlir/lib/Dialect/SparseTensor/Transforms/LoopEmitter.h
46	I think it'd be better to define `using TensorLevel = unsigned;` and compress things like we do for `TensorLoopId`.
239	I think you should just define the `struct` directly, rather than using `std::tuple`. That way client code can use nice names for accessing the fields directly, rather than defining all those wrappers around `std::get<N>`. Also, the struct should be named `LoopSliceInfo` for consistency with `LoopInfo`, `SliceInfo`, etc. And, of course, the struct should be `final`. If you are concerned about the ability to use structured-binding in the foreach-loops, then take a look at the "Case 3: binding to data members" section of https://en.cppreference.com/w/cpp/language/structured_binding . tl;dr: the syntax/semantics should be exactly the same as for the `std::tuple` case.
248	I think the name `tidLvls` would be a lot cleaner here, and consistent with elsewhere.
253–256	I think it'd be best to leave this as "operating on" like it was before. Although this is "iterating" in the sparse-compiler sense of enumerating the stored values of sparse-tensors, I feel that using that term here would cause confusion vs the C++ iterator framework.
255	Grammatically, that should be "information" (singular, and lowercase). However, I think it'd be better to leave it as "slice-driven loop conditions" like it was before, since the term "information" doesn't really tell us very much about what exactly that means/contains.
mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorRewriting.cpp
954–956	Why keep this defn of `ld` around but commented out? If it's no longer needed, then we should remove it (along with my fixme comment about trying to figure out what it's actually supposed to be).
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
1304–1308	should be "tidLvl" to match the names used elsewhere
1305	It seems worth adding a comment explaining why we discard `tlPair.second` and use `ldx` instead; i.e., a comment saying that `isSparse` is checking whether any tensor is sparse for the loop `ldx`, regardless of what level that corresponds to and regardless of what level is stored in `tidLvls`. (Or, if the levels in `tidLvls` are supposed to agree with the loop, then say that instead. Assuming we actually maintain that invariant, then we don't need to add assertions to verify that; but we do want a comment explaining that this invariant holds.)
1551	Why `condTidLvls` instead of just `tidLvls`? (I'm not saying it should be changed, just curious why the different name)
1654	I think the `affineTidLvls` and `affines` should actually be combined into a single `SmallVector<TLA>` where `struct TLA final { TensorId; Level; AffineExpr }` (or `struct TLA final {TensorLevel; AffineExpr}` with the appropriate getters). That helps capture the invariant that `translateBitsToTidLvlPairs` keeps them the same length, and thus also helps avoid needing to `zip` them together later on. Of course, you'll need to come up with a better name than my "TLA" ;)

Peiming added inline comments. Apr 17 2023, 2:23 PM

mlir/lib/Dialect/SparseTensor/Transforms/LoopEmitter.h
239	Actually, I deleted the wrappers; I find them useless. We only use it in structured bindings in foreach loops, and this is a private type, so it should be okay? WDYT?

address some comments.

Peiming added inline comments. Apr 17 2023, 2:39 PM

mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
1551	Because the call sites were using `condTid` (to distinguish from `affineTid`). But yeah, `cond-` is not an accurate prefix, since not all `tid, lvl` pairs appear in the loop condition (most of the dense levels do not). I changed it to `tidLvls`.

wrengr added inline comments. Apr 17 2023, 3:00 PM

mlir/lib/Dialect/SparseTensor/Transforms/LoopEmitter.h
239	I still think we should define our own struct rather than reusing `std::tuple`. I feel like the `std::pair`/`std::tuple` classes are really only intended to be used for those cases where folks need an ad-hoc struct; hence, whenever things aren't ad-hoc then they should be given their own definitions.
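
To illustrate the point about structured bindings, here is a minimal standalone sketch. It assumes plain `unsigned` stand-ins for `TensorId` and `Level`, and the helper functions `countReduced` and `anyReduced` are hypothetical, not part of the patch. The loop header is the same for the tuple alias and for a named struct, while the struct additionally allows access by field name.

```cpp
#include <cassert>
#include <tuple>
#include <vector>

// Stand-in typedefs (assumptions for this sketch, not the MLIR headers).
using TensorId = unsigned;
using Level = unsigned;

// Tuple-based alias, as in the diff under review.
using SliceLoopTuple = std::tuple<TensorId, Level, bool /*reduced*/>;

// Struct-based alternative with named fields, using the suggested name.
struct LoopSliceInfo final {
  TensorId tid;
  Level lvl;
  bool reduced;
};

// Counts reduced entries via structured bindings over tuples.
inline unsigned countReduced(const std::vector<SliceLoopTuple> &infos) {
  unsigned n = 0;
  for (auto [tid, lvl, reduced] : infos) {
    (void)tid;
    (void)lvl;
    if (reduced)
      ++n;
  }
  return n;
}

// Same loop header, but binding to the struct's data members.
inline unsigned countReduced(const std::vector<LoopSliceInfo> &infos) {
  unsigned n = 0;
  for (auto [tid, lvl, reduced] : infos) {
    (void)tid;
    (void)lvl;
    if (reduced)
      ++n;
  }
  return n;
}

// With the struct, fields can also be read by name without std::get<N>.
inline bool anyReduced(const std::vector<LoopSliceInfo> &infos) {
  for (const LoopSliceInfo &info : infos)
    if (info.reduced)
      return true;
  return false;
}
```

Binding to data members requires all non-static members to be public, which holds for a plain struct like this one.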

Harbormaster completed remote builds in B226227: Diff 514413. Apr 17 2023, 3:54 PM

use single unsigned for tensor lvl pairs
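
As a rough sketch of what packing a pair into a single `unsigned` means here: the diff encodes a pair as `l * getNumTensors() + t` and decodes it with `%` and `/`. The free functions below are hypothetical stand-ins that take the tensor count as an explicit parameter; they are not the `LoopEmitter` members themselves.

```cpp
#include <cassert>
#include <utility>

// Stand-in typedefs (assumptions for this sketch).
using TensorId = unsigned;
using Level = unsigned;
using TensorLevel = unsigned;

// Packs (t, l) as l * numTensors + t. Requires t < numTensors for the
// encoding to be reversible.
constexpr TensorLevel packTensorLevel(TensorId t, Level l,
                                      unsigned numTensors) {
  return l * numTensors + t;
}

// Inverse of packTensorLevel: returns the (tensor id, level) pair.
inline std::pair<TensorId, Level> unpackTensorLevel(TensorLevel tl,
                                                    unsigned numTensors) {
  assert(numTensors > 0 && "need at least one tensor");
  return {tl % numTensors, tl / numTensors};
}

// With 4 tensors, tensor 2 at level 3 is encoded as 3 * 4 + 2.
static_assert(packTensorLevel(2, 3, 4) == 14, "encoding check");
```

Because the tensor id occupies the low "digit", decoding only round-trips when the same tensor count is used on both sides.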

Peiming marked an inline comment as done. Apr 17 2023, 5:01 PM

Harbormaster completed remote builds in B226262: Diff 514462. Apr 17 2023, 5:36 PM

aartbik accepted this revision. Apr 20 2023, 2:00 PM

aartbik added inline comments.

mlir/lib/Dialect/SparseTensor/Transforms/CodegenEnv.h
68	please put this empty line back, all Section // comments have whitespace around them
90	empty line
92	agree (since the subject is plural)
mlir/lib/Dialect/SparseTensor/Transforms/LoopEmitter.h
194	Gets
208	Give this one a doc string too

This revision is now accepted and ready to land. Apr 20 2023, 2:00 PM

address comments.

Peiming added inline comments. Apr 20 2023, 2:38 PM

mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
1654	I want to keep it, because the `affineTidLvls` are used independently from `affines` later. See L1662

Harbormaster completed remote builds in B226985: Diff 515491. Apr 20 2023, 3:04 PM

wrengr added inline comments. Apr 21 2023, 12:00 PM

mlir/lib/Dialect/SparseTensor/Transforms/CodegenEnv.h
92	This should be `TensorId`
92	This should be `Level`
mlir/lib/Dialect/SparseTensor/Transforms/LoopEmitter.cpp
764	It's more appropriate to use `push_back` here, since the return type of `makeTensorLevel` is exactly what's being stored (i.e., it's not being passed to some constructor for what's actually stored)
770	ditto, should be `push_back`
mlir/lib/Dialect/SparseTensor/Transforms/LoopEmitter.h
203	I think a name like `unpackTensorLevel` would better match `makeTensorLevel`.
208	This should use "///" since it's documentation not a comment
208	I get what you're after by saying "`Container<>`" and "`range<>`", however since those are not actual names of templates, it'd be better to rephrase this to "Converts a range of `TensorLevel` to a range of `std::pair<TensorId, Level>`."
210	I think you'll want to add `std::remove_const_t` there too, since that's what all the analogous sfinae code in MLIR does for this sort of thing.
214	Mirroring the earlier suggestion, this would then be `unpackTensorLevelRange`
216	this should be `std::forward<ContainerTy>(c)`
239	Should be `final`.
240–242	style-guide says that the fields should be at the end of the defn
253–256	Should be just "tensor"
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
1495	`push_back`
1565	`push_back`
1583	`push_back`
1586	`push_back`
1624	`push_back`
1636	`push_back`
1662–1663	I still think it'd be better to have a single `SmallVector<std::pair<TensorLevel, AffineExpr>>` since that helps capture the invariant that these should be the same length. But if you're not going to do that, then you should use `llvm::zip_equal` so that it checks that they have the same length rather than silently truncating things to the shorter list
1669–1671	If this concat is what's making you reluctant to use `SmallVector<std::pair<TensorLevel, AffineExpr>>`, then recall that you can use `llvm::concat<TensorLevel>(tidLvls, llvm::make_first_range(affineTidLvlExprs))`.
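
The single-vector-of-pairs idea can be sketched without LLVM. This is only a standard-library analogue of `llvm::concat` combined with `llvm::make_first_range`, under the assumption that a `std::string` stands in for `AffineExpr`; the names `AffineExprStub`, `TidLvlExpr`, and `concatLevels` are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Stand-ins (assumptions for this sketch): a string plays the role of
// AffineExpr so the example needs no MLIR headers.
using TensorLevel = unsigned;
using AffineExprStub = std::string;
using TidLvlExpr = std::pair<TensorLevel, AffineExprStub>;

// Returns tidLvls followed by the TensorLevel half of each pair. Keeping
// level and expression in one element means the two can never get out of
// sync in length, so no zip is needed later.
inline std::vector<TensorLevel>
concatLevels(const std::vector<TensorLevel> &tidLvls,
             const std::vector<TidLvlExpr> &affineTidLvlExprs) {
  std::vector<TensorLevel> out(tidLvls);
  out.reserve(tidLvls.size() + affineTidLvlExprs.size());
  for (const auto &[tl, expr] : affineTidLvlExprs) {
    (void)expr; // only the level half is needed here
    out.push_back(tl);
  }
  return out;
}
```

Unlike the lazy LLVM range adaptors, this version copies into a new vector, which is fine for illustration but not what the patch would do.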

address comments.

Peiming added inline comments. Apr 21 2023, 1:22 PM

mlir/lib/Dialect/SparseTensor/Transforms/LoopEmitter.h
240–242	Good to know!
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
1662–1663	Good to know there is a `zip_equal`.

Use a pair for affineTidLvls and AffineExpr.

minor improvement.

wrengr accepted this revision. Apr 21 2023, 2:00 PM

wrengr added inline comments.

mlir/lib/Dialect/SparseTensor/Transforms/LoopEmitter.h
210	it should be in the order `remove_reference_t<remove_const_t<`, at least that's the order I see elsewhere, haven't checked to see if/how it matters

Peiming marked an inline comment as done. Apr 21 2023, 2:19 PM

Peiming added inline comments.

mlir/lib/Dialect/SparseTensor/Transforms/LoopEmitter.h
210	I don't think it matters, but yeah, good to be consistent anyway.

Peiming marked 2 inline comments as done. Apr 21 2023, 2:20 PM

Harbormaster completed remote builds in B227316: Diff 515903. Apr 21 2023, 2:24 PM

Peiming added inline comments. Apr 21 2023, 2:26 PM

mlir/lib/Dialect/SparseTensor/Transforms/LoopEmitter.h
210	For the record, the order matters https://stackoverflow.com/questions/30690269/why-does-using-stdremove-reference-and-stdremove-const-in-different-order-pr
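
A small compile-time check makes the ordering issue concrete. `std::remove_const_t` only strips a top-level `const`, and a reference type has none, so applying it before `std::remove_reference_t` leaves the `const` on the referred-to type. The aliases `ConstIntRef` and `RemoveCvRef` below are names made up for this sketch; `RemoveCvRef` is a C++17 spelling of what `remove_cvref_t` does.

```cpp
#include <type_traits>

using ConstIntRef = const int &;

// remove_const on a reference is a no-op: the const belongs to the
// referred-to type, not to the reference itself.
static_assert(std::is_same_v<std::remove_const_t<ConstIntRef>, const int &>);

// Stripping const first and the reference second leaves `const int`.
static_assert(
    std::is_same_v<std::remove_reference_t<std::remove_const_t<ConstIntRef>>,
                   const int>);

// Stripping the reference first and const second yields plain `int`.
static_assert(
    std::is_same_v<std::remove_const_t<std::remove_reference_t<ConstIntRef>>,
                   int>);

// A remove_cvref-style alias always applies the traits in the working order.
template <class T>
using RemoveCvRef = std::remove_cv_t<std::remove_reference_t<T>>;
static_assert(std::is_same_v<RemoveCvRef<ConstIntRef>, int>);
```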

using llvm::remove_cvref_t

rebase.

This revision was landed with ongoing or failed builds. May 4 2023, 9:15 AM

Closed by commit rG36c95ee739c0: [mlir][sparse] group tensor id and levels into pairs in loop emitter (authored by Peiming).

This revision was automatically updated to reflect the committed changes.

Peiming added a commit: rG36c95ee739c0: [mlir][sparse] group tensor id and levels into pairs in loop emitter.

Harbormaster completed remote builds in B230002: Diff 519520. May 4 2023, 9:28 AM

vzakhari mentioned this in D142930: [mlir][sparse] extend loop emitter to emit slice driven loops. May 4 2023, 10:05 AM

@Peiming can you please check, the build is failing. Pasting the link to premerge-check https://buildkite.com/llvm-project/premerge-checks/builds/150384#0187e78d-a047-440e-9018-0eecabff91c8

In file included from /home/scratch/llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/CodegenEnv.h:17,
                 from /home/scratch/llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp:13:
/home/scratch/llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/LoopEmitter.h: In member function ‘constexpr mlir::sparse_tensor::TensorLevel mlir::sparse_tensor::LoopEmitter::makeTensorLevel(mlir::sparse_tensor::TensorId, mlir::sparse_tensor::Level) const’:
/home/scratch/llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/LoopEmitter.h:199:29: error: call to non-‘constexpr’ function ‘unsigned int mlir::sparse_tensor::LoopEmitter::getNumTensors() const’
  199 |     return l * getNumTensors() + t;
      |                ~~~~~~~~~~~~~^~
/home/scratch/llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/LoopEmitter.h:195:12: note: ‘unsigned int mlir::sparse_tensor::LoopEmitter::getNumTensors() const’ declared here
  195 |   unsigned getNumTensors() const { return tensors.size(); }
      |            ^~~~~~~~~~~~~
In file included from /usr/include/c++/11/cassert:44,
                 from /home/scratch/llvm-project/llvm/include/llvm/Support/CommandLine.h:34,
                 from /home/scratch/llvm-project/mlir/include/mlir/Pass/PassOptions.h:21,
                 from /home/scratch/llvm-project/mlir/include/mlir/Pass/PassRegistry.h:17,
                 from /home/scratch/llvm-project/mlir/include/mlir/Pass/Pass.h:13,
                 from /home/scratch/llvm-project/mlir/include/mlir/Dialect/SparseTensor/Transforms/Passes.h:21,
                 from /home/scratch/llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/CodegenEnv.h:21,
                 from /home/scratch/llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp:13:
/home/scratch/llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/CodegenEnv.h: In member function ‘constexpr mlir::sparse_tensor::TensorLevel mlir::sparse_tensor::CodegenEnv::makeTensorLevel(mlir::sparse_tensor::TensorId, mlir::sparse_tensor::Level) const’:
/home/scratch/llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/CodegenEnv.h:95:37: error: call to non-‘constexpr’ function ‘unsigned int mlir::sparse_tensor::LoopEmitter::getNumTensors() const’
   95 |     assert(loopEmitter.getNumTensors() == linalgOp->getNumOperands() &&
      |            ~~~~~~~~~~~~~~~~~~~~~~~~~^~
In file included from /home/scratch/llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/CodegenEnv.h:17

In D148565#4319356, @chaitanyav wrote:

@Peiming can you please check, the build is failing. Pasting the link to premerge-check https://buildkite.com/llvm-project/premerge-checks/builds/150384#0187e78d-a047-440e-9018-0eecabff91c8

On it

vzakhari added a subscriber: vzakhari. May 4 2023, 10:25 AM

vzakhari added inline comments.

mlir/lib/Dialect/SparseTensor/Transforms/CodegenEnv.h
92	`getNumTensors` cannot be called from `constexpr` function.
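
The failure in the log above can be reduced to a few lines. The class below is a hypothetical miniature, not the real `LoopEmitter`: `makeTensorLevel` calls `getNumTensors()`, which is not declared `constexpr`, so marking `makeTensorLevel` as `constexpr` is what GCC rejects. One possible remedy, shown here as an assumption rather than as the contents of the follow-up patch, is simply to drop `constexpr` from the caller.

```cpp
#include <cassert>
#include <vector>

// Hypothetical miniature of the situation in the build log.
class MiniEmitter {
public:
  explicit MiniEmitter(unsigned numTensors) : tensors(numTensors) {}

  // Not declared constexpr, so it cannot be called in a constant expression.
  unsigned getNumTensors() const {
    return static_cast<unsigned>(tensors.size());
  }

  // Writing `constexpr TensorLevel makeTensorLevel(...) const` here would
  // reproduce the reported GCC error, since no call could ever be a constant
  // expression. Before C++23 such a function is ill-formed with no
  // diagnostic required, which is why some compilers accept it silently.
  unsigned makeTensorLevel(unsigned t, unsigned l) const {
    return l * getNumTensors() + t;
  }

private:
  std::vector<int> tensors;
};
```

The same reasoning applies to the `CodegenEnv::makeTensorLevel` delegate, whose `assert` also calls `getNumTensors()`.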

vzakhari mentioned this in D149874: [mlir][sparse] fix build error.. May 4 2023, 10:26 AM

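Revision Contents

Path	Size
mlir/lib/Dialect/SparseTensor/Transforms/CodegenEnv.h	15 lines
mlir/lib/Dialect/SparseTensor/Transforms/LoopEmitter.h	74 lines
mlir/lib/Dialect/SparseTensor/Transforms/LoopEmitter.cpp	99 lines
mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorRewriting.cpp	10 lines
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp	89 lines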

Diff 515486

mlir/lib/Dialect/SparseTensor/Transforms/CodegenEnv.h

[59 lines not shown]	public:
std::optional<Operation *>		std::optional<Operation *>
genLoopBoundary(function_ref<		genLoopBoundary(function_ref<
std::optional<Operation *>(MutableArrayRef<Value> parameters)>		std::optional<Operation *>(MutableArrayRef<Value> parameters)>
callback);		callback);

//		//
// Merger delegates.		// Merger delegates.
//		//

aartbik: please put this empty line back, all Section // comments have whitespace around them
constexpr TensorId makeTensorId(unsigned t) const {		constexpr TensorId makeTensorId(unsigned t) const {
return latticeMerger.makeTensorId(t);		return latticeMerger.makeTensorId(t);
}		}
constexpr LoopId makeLoopId(unsigned i) const {		constexpr LoopId makeLoopId(unsigned i) const {
return latticeMerger.makeLoopId(i);		return latticeMerger.makeLoopId(i);
}		}
constexpr TensorLoopId makeTensorLoopId(unsigned t, unsigned i) const {		constexpr TensorLoopId makeTensorLoopId(unsigned t, unsigned i) const {
return latticeMerger.makeTensorLoopId(t, i);		return latticeMerger.makeTensorLoopId(t, i);
}		}
const TensorExp &exp(ExprId e) const { return latticeMerger.exp(e); }		const TensorExp &exp(ExprId e) const { return latticeMerger.exp(e); }
const LatPoint &lat(LatPointId l) const { return latticeMerger.lat(l); }		const LatPoint &lat(LatPointId l) const { return latticeMerger.lat(l); }
ArrayRef<LatPointId> set(LatSetId s) const { return latticeMerger.set(s); }		ArrayRef<LatPointId> set(LatSetId s) const { return latticeMerger.set(s); }
DimLevelType dlt(TensorId t, LoopId i) const {		DimLevelType dlt(TensorId t, LoopId i) const {
return latticeMerger.getDimLevelType(t, i);		return latticeMerger.getDimLevelType(t, i);
}		}
DimLevelType dlt(TensorLoopId b) const {		DimLevelType dlt(TensorLoopId b) const {
return latticeMerger.getDimLevelType(b);		return latticeMerger.getDimLevelType(b);
}		}

//		//
		// LoopEmitter delegates.
		//
		aartbik: empty line

		constexpr TensorLevel makeTensorLevel(unsigned t, unsigned l) const {
		aartbik: agree (since the subject is plural)
		wrengr: This should be `TensorId`
		wrengr: This should be `Level`
		vzakhari: `getNumTensors` cannot be called from `constexpr` function.
		// Make sure LoopEmitter, GenericOp, and Merger agree on the number of
		// tensors. Merger has one more synthetic tensor for loop invariants.
		assert(loopEmitter.getNumTensors() == linalgOp->getNumOperands() &&
		loopEmitter.getNumTensors() == latticeMerger.getNumTensors() - 1);
		return loopEmitter.makeTensorLevel(t, l);
		}
		std::pair<TensorId, Level> toTidLvlPair(TensorLevel tl) const {
		return loopEmitter.toTidLvlPair(tl);
		}

		//
// Code generation environment verify functions.		// Code generation environment verify functions.
//		//

/// Whether the tensor expression is admissible for codegen.		/// Whether the tensor expression is admissible for codegen.
/// It also sets the sparseOut if the output tensor is sparse.		/// It also sets the sparseOut if the output tensor is sparse.
bool isAdmissibleTensorExp(ExprId e);		bool isAdmissibleTensorExp(ExprId e);

/// Whether the iteration graph is sorted in admissible topoOrder.		/// Whether the iteration graph is sorted in admissible topoOrder.
[111 lines not shown]

mlir/lib/Dialect/SparseTensor/Transforms/LoopEmitter.h

[36 lines not shown]
/// In addition `LoopEmitter::genAffine` has `AffineDimExpr::position`		/// In addition `LoopEmitter::genAffine` has `AffineDimExpr::position`
/// correspond to `LoopId`, however it is unclear what the providence		/// correspond to `LoopId`, however it is unclear what the providence
/// of those `AffineDimExpr` is.		/// of those `AffineDimExpr` is.
//		//
// TODO: use a struct/class rather than a typedef, so that we can actually		// TODO: use a struct/class rather than a typedef, so that we can actually
// typecheck this to avoid mixups in the code.		// typecheck this to avoid mixups in the code.
using LoopOrd = unsigned;		using LoopOrd = unsigned;

		// A compressed <tensor id, level> pair.
		using TensorLevel = unsigned;
		wrengr: I think it'd be better to define `using TensorLevel = unsigned;` and compress things like we do for `TensorLoopId`.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// SparseTensorLoopEmiter class, manages sparse tensors and helps to		// SparseTensorLoopEmiter class, manages sparse tensors and helps to
// generate loop structure to (co)-iterate sparse tensors.		// generate loop structure to (co)-iterate sparse tensors.
//		//
// An example usage:		// An example usage:
// To generate the following loops over T1<?x?> and T2<?x?>		// To generate the following loops over T1<?x?> and T2<?x?>
//		//
// for i in TENSOR_1_0 {		// for i in TENSOR_1_0 {
[76 lines not shown]	public:
/// break p0		/// break p0
///		///
/// // Starts loop from p0		/// // Starts loop from p0
/// for (i = p0; i < end; i++)		/// for (i = p0; i < end; i++)
/// ...		/// ...
/// // loop sequence end.		/// // loop sequence end.
/// }		/// }
void enterNewLoopSeq(OpBuilder &builder, Location loc,		void enterNewLoopSeq(OpBuilder &builder, Location loc,
ArrayRef<TensorId> tids, ArrayRef<Level> lvls);		ArrayRef<TensorLevel> tidLvls);

/// Exits the current loop sequence, this will reset universal index to 0.		/// Exits the current loop sequence, this will reset universal index to 0.
void exitCurrentLoopSeq(OpBuilder &builder, Location loc);		void exitCurrentLoopSeq(OpBuilder &builder, Location loc);

// TODO: Get rid of `lvls` in the argument list? Track the level we		// TODO: Get rid of `lvls` in the argument list? Track the level we
// are currently at internally. Then it would be enterNextLvlForTensor.		// are currently at internally. Then it would be enterNextLvlForTensor.
// Still need a way to specify the lvl for non-annotated tensors though,		// Still need a way to specify the lvl for non-annotated tensors though,
// as those can be accessed out of order.		// as those can be accessed out of order.
//		//
/// Emits loop over tensor_tid_lvl, it assumes that loops between		/// Emits loop over tensor_tid_lvl, it assumes that loops between
/// tensor_tid_[0, lvl - 1] have already been generated.		/// tensor_tid_[0, lvl - 1] have already been generated.
/// The function will also perform in-place update on the `reduc` vector to		/// The function will also perform in-place update on the `reduc` vector to
/// return the reduction variable used inside the generated loop.		/// return the reduction variable used inside the generated loop.
Operation *enterLoopOverTensorAtLvl(OpBuilder &builder, Location loc,		Operation *enterLoopOverTensorAtLvl(OpBuilder &builder, Location loc,
ArrayRef<TensorId> tids,		ArrayRef<TensorLevel> tidLvls,
ArrayRef<Level> lvls,
MutableArrayRef<Value> reduc = {},		MutableArrayRef<Value> reduc = {},
bool isParallel = false);		bool isParallel = false);

Operation *enterFilterLoopOverTensorAtLvl(OpBuilder &builder, Location loc,		Operation *enterFilterLoopOverTensorAtLvl(OpBuilder &builder, Location loc,
TensorId tid, Level lvl,		TensorId tid, Level lvl,
AffineExpr affine,		AffineExpr affine,
MutableArrayRef<Value> reduc = {});		MutableArrayRef<Value> reduc = {});

void genDenseAffineAddress(OpBuilder &builder, Location loc, TensorId tid,		void genDenseAffineAddress(OpBuilder &builder, Location loc,
Level lvl, AffineExpr lvlExpr);		TensorLevel tidLvl, AffineExpr lvlExpr);

/// Emits a co-iteration loop over a set of tensors.		/// Emits a co-iteration loop over a set of tensors.
Operation *enterCoIterationOverTensorsAtLvls(		Operation *enterCoIterationOverTensorsAtLvls(
OpBuilder &builder, Location loc, ArrayRef<TensorId> tids,		OpBuilder &builder, Location loc, ArrayRef<TensorLevel> tidLvls,
ArrayRef<Level> lvls, bool needsUniv, MutableArrayRef<Value> reduc = {});		bool needsUniv, MutableArrayRef<Value> reduc = {});

void exitCurrentLoop(RewriterBase &rewriter, Location loc,		void exitCurrentLoop(RewriterBase &rewriter, Location loc,
MutableArrayRef<Value> reduc = {});		MutableArrayRef<Value> reduc = {});

/// Fills the out-parameter with the loop induction variables for all		/// Fills the out-parameter with the loop induction variables for all
/// loops in the current loop-stack. The variables are given in the		/// loops in the current loop-stack. The variables are given in the
/// same order as the loop-stack, hence `ivs` should be indexed into		/// same order as the loop-stack, hence `ivs` should be indexed into
/// by `LoopOrd` (not `LoopId`).		/// by `LoopOrd` (not `LoopId`).
void getLoopIVs(SmallVectorImpl<Value> &ivs) const {		void getLoopIVs(SmallVectorImpl<Value> &ivs) const {
ivs.clear();		ivs.clear();
ivs.reserve(getCurrentDepth());		ivs.reserve(getCurrentDepth());
for (auto &l : loopStack)		for (auto &l : loopStack)
ivs.push_back(l.iv);		ivs.push_back(l.iv);
}		}

/// Gets the current depth of the loop-stack. The result is given		/// Gets the current depth of the loop-stack. The result is given
/// the type `LoopOrd` for the same reason as one-past-the-end iterators.		/// the type `LoopOrd` for the same reason as one-past-the-end iterators.
LoopOrd getCurrentDepth() const { return loopStack.size(); }		LoopOrd getCurrentDepth() const { return loopStack.size(); }

/// Gets loop induction variable for the given `LoopOrd`.		/// Gets loop induction variable for the given `LoopOrd`.
Value getLoopIV(LoopOrd n) const {		Value getLoopIV(LoopOrd n) const {
return n < getCurrentDepth() ? loopStack[n].iv : Value();		return n < getCurrentDepth() ? loopStack[n].iv : Value();
}		}

		/// Gets the total number of tensors that loopEmitter is operating on.
		aartbik: Gets
		unsigned getNumTensors() const { return tensors.size(); }

		/// Compresses a TensorId and Level into a TensorLevel.
		constexpr TensorLevel makeTensorLevel(TensorId t, Level l) const {
		return l * getNumTensors() + t;
		}

		/// De-compresses a TensorLevel back to a pair of TensorId and Level.
		std::pair<TensorId, Level> toTidLvlPair(TensorLevel tidLvl) const {
		wrengr: I think a name like `unpackTensorLevel` would better match `makeTensorLevel`.
		unsigned nt = getNumTensors();
		return std::make_pair(tidLvl % nt, tidLvl / nt);
		}

		// Maps a Container<TensorLevel> into a range<std::pair<TensorId, Level>>.
		aartbik: Give this one a doc string too
		wrengr: This should use "///" since it's documentation not a comment
		wrengr: I get what you're after by saying "`Container<>`" and "`range<>`", however since those are not actual names of templates, it'd be better to rephrase this to "Converts a range of `TensorLevel` to a range of `std::pair<TensorId, Level>`."
		template <class ContainerTy,
		std::enable_if_t<std::is_same_v<typename std::remove_reference_t<
		wrengr: I think you'll want to add `std::remove_const_t` there too, since that's what all the analogous sfinae code in MLIR does for this sort of thing.
		wrengr: it should be in the order `remove_reference_t<remove_const_t<`, at least that's the order I see elsewhere, haven't checked to see if/how it matters
		Peiming: I don't think it matters, but yeah, good to be consistent anyway.
		Peiming: For the record, the order matters https://stackoverflow.com/questions/30690269/why-does-using-stdremove-reference-and-stdremove-const-in-different-order-pr
		ContainerTy>::value_type,
		TensorLevel>,
		bool> = true>
		auto toTidLvlPairRange(ContainerTy &&c) const {
		wrengr: Mirroring the earlier suggestion, this would then be `unpackTensorLevelRange`
		return llvm::map_range(
		c, [this](TensorLevel tl) { return this->toTidLvlPair(tl); });
		wrengr: this should be `std::forward<ContainerTy>(c)`
		}

///		///
/// Getters.		/// Getters.
///		///
const std::vector<std::vector<Value>> &getPosits() const { return posits; };		const std::vector<std::vector<Value>> &getPosits() const { return posits; };
const std::vector<std::vector<Value>> &getCoords() const { return coords; };		const std::vector<std::vector<Value>> &getCoords() const { return coords; };
const std::vector<std::vector<Value>> &getHighs() const { return highs; };		const std::vector<std::vector<Value>> &getHighs() const { return highs; };
const std::vector<std::vector<Value>> &getPositionBuffers() const {		const std::vector<std::vector<Value>> &getPositionBuffers() const {
return positionsBuffers;		return positionsBuffers;
};		};
const std::vector<std::vector<Value>> &getCoordinateBuffers() const {		const std::vector<std::vector<Value>> &getCoordinateBuffers() const {
return coordinatesBuffers;		return coordinatesBuffers;
};		};
const std::vector<Value> &getValBuffer() const { return valBuffer; };		const std::vector<Value> &getValBuffer() const { return valBuffer; };

constexpr static llvm::StringLiteral getLoopEmitterLoopAttrName() {		constexpr static llvm::StringLiteral getLoopEmitterLoopAttrName() {
return llvm::StringLiteral("Emitted from");		return llvm::StringLiteral("Emitted from");
}		}

private:		private:
		// A tuple that stored the slice-driven loop information.
		using SliceLoopTuple = std::tuple<TensorId, Level, bool /*reduced*/>;
		wrengr: I think you should just define the `struct` directly, rather than using `std::tuple`. That way client code can use nice names for accessing the fields directly, rather than defining all those wrappers around `std::get<N>`. Also, the struct should be named `LoopSliceInfo` for consistency with `LoopInfo`, `SliceInfo`, etc. And, of course, the struct should be `final`. If you are concerned about the ability to use structured-binding in the foreach-loops, then take a look at the "Case 3: binding to data members" section of https://en.cppreference.com/w/cpp/language/structured_binding . tl;dr: the syntax/semantics should be exactly the same as for the `std::tuple` case.
		Peiming: Actually, I deleted the wrappers; I find them useless. We only use it in structured bindings in foreach loops, and this is a private type, so it should be okay? WDYT?
		wrengr: I still think we should define our own struct rather than reusing `std::tuple`. I feel like the `std::pair`/`std::tuple` classes are really only intended to be used for those cases where folks need an ad-hoc struct; hence, whenever things aren't ad-hoc then they should be given their own definitions.
		wrengr: Should be `final`.

// LoopInfo stores information of a loop generated by LoopEmitter. E.g.,		// LoopInfo stores information of a loop generated by LoopEmitter. E.g.,
// the set of tensors levels that the loop is iterating over.		// the set of tensors levels that the loop is iterating over.
		wrengr: style-guide says that the fields should be at the end of the defn
		Peiming (author): Good to know!
struct LoopInfo final {		struct LoopInfo final {
LoopInfo(ArrayRef<TensorId> tids, ArrayRef<Level> lvls,		LoopInfo(ArrayRef<TensorLevel> tidLvls,
ArrayRef<TensorId> slicedTids, ArrayRef<Level> slicedLvls,		ArrayRef<SliceLoopTuple> sliceDrivenInfo, Operation *loop,
ArrayRef<bool> sliceReduced, Operation *loop, Block *userBlock,		Block *userBlock, Value iv, StringAttr loopTag)
Value iv, StringAttr loopTag)		: tidLvls(tidLvls), sliceDrivenInfo(sliceDrivenInfo), loop(loop),
: tids(tids), lvls(lvls), slicedTids(slicedTids),
slicedLvls(slicedLvls), sliceReduced(sliceReduced), loop(loop),
userCodeBlock(userBlock), iv(iv) {		userCodeBlock(userBlock), iv(iv) {
		wrengr: I think the name `tidLvls` would be a lot cleaner here, and consistent with elsewhere.
// Attached a special tag to loop emitter generated loop.		// Attached a special tag to loop emitter generated loop.
if (loopTag)		if (loopTag)
loop->setAttr(LoopEmitter::getLoopEmitterLoopAttrName(), loopTag);		loop->setAttr(LoopEmitter::getLoopEmitterLoopAttrName(), loopTag);
}		}
// TODO: maybe use a vector<pair> for tid and lvl?		// The set of <tensors, lvl> that the loop is operating on
// (Or compress them together with a `TensorLoopId`.)		const llvm::SmallVector<TensorLevel> tidLvls;
// The set of tensors that the loop is operating on		// Slice-driven loop conditions.
		wrengr: Grammatically, that should be "information" (singular, and lowercase). However, I think it'd be better to leave it as "slice-driven loop conditions" like it was before, since the term "information" doesn't really tell us very much about what exactly that means/contains.
const llvm::SmallVector<TensorId> tids;		const llvm::SmallVector<SliceLoopTuple> sliceDrivenInfo;
		wrengr: I think it'd be best to leave this as "operating on" like it was before. Although this is "iterating" in the sparse-compiler sense of enumerating the stored values of sparse-tensors, I feel that using that term here would cause confusion vs the C++ iterator framework.
		wrengr: Should be just "tensor"
// The corresponding levels for the tensors
const llvm::SmallVector<Level> lvls;
// The set of tensors for slice-driven loop conditions.
const llvm::SmallVector<TensorId> slicedTids;
// The corresponding level for slice-driven tensors.
const llvm::SmallVector<Level> slicedLvls;
// Whether the tensor is fully reduced (e.g., i + j => j).
const llvm::SmallVector<bool> sliceReduced;
const Operation *loop; // the loop operation		const Operation *loop; // the loop operation
Block *const userCodeBlock; // the block holding users' generated code.		Block *const userCodeBlock; // the block holding users' generated code.
const Value iv; // the induction variable for the loop		const Value iv; // the induction variable for the loop
};		};

// SliceInfo stores information of an extracted slice for slice-driven loop.		// SliceInfo stores information of an extracted slice for slice-driven loop.
// E.g., the in-scope SSA values for the minimum coordinates and offset for		// E.g., the in-scope SSA values for the minimum coordinates and offset for
// the slice, etc.		// the slice, etc.
[44 lines elided]	private:

/// Generates a predicate to determine whether the tranformed coordinates are		/// Generates a predicate to determine whether the tranformed coordinates are
/// in the given slice.		/// in the given slice.
/// Returns std::pair<Transformed coordinates, Predicate>		/// Returns std::pair<Transformed coordinates, Predicate>
std::pair<Value, Value> genSliceLegitPredicate(OpBuilder &builder,		std::pair<Value, Value> genSliceLegitPredicate(OpBuilder &builder,
Location loc, Value crd,		Location loc, Value crd,
TensorId tid, Level lvl);		TensorId tid, Level lvl);

unsigned getNumTensors() const { return tensors.size(); }

bool isOutputTensor(TensorId tid) const {		bool isOutputTensor(TensorId tid) const {
return hasOutput && tid == getNumTensors() - 1;		return hasOutput && tid == getNumTensors() - 1;
}		}

bool isSparseOutput(TensorId tid) const {		bool isSparseOutput(TensorId tid) const {
return isOutputTensor(tid) && isSparseOut;		return isOutputTensor(tid) && isSparseOut;
}		}

bool isValidLevel(TensorId tid, Level lvl) const {		bool isValidLevel(TensorId tid, Level lvl) const {
return tid < lvlTypes.size() && lvl < lvlTypes[tid].size();		return tid < lvlTypes.size() && lvl < lvlTypes[tid].size();
}		}

/// Prepares loop for iterating over `tensor[lvl]`, under the assumption		/// Prepares loop for iterating over `tensor[lvl]`, under the assumption
/// that `tensor[0...lvl-1]` loops have already been set up.		/// that `tensor[0...lvl-1]` loops have already been set up.
void prepareLoopOverTensorAtLvl(OpBuilder &builder, Location loc,		void prepareLoopOverTensorAtLvl(OpBuilder &builder, Location loc,
TensorId tid, Level lvl);		TensorId tid, Level lvl);

/// Emits extra locals, since the locals might not be in simplified lattices		/// Emits extra locals, since the locals might not be in simplified lattices
/// point used to generate the loops, but are still required to generate		/// point used to generate the loops, but are still required to generate
/// expressions.		/// expressions.
void emitExtraLocalsForTensorsAtDenseLvls(OpBuilder &builder, Location loc,		void emitExtraLocalsForTensorsAtDenseLvls(OpBuilder &builder, Location loc,
ArrayRef<TensorId> tids,		ArrayRef<TensorLevel> tidLvls);
ArrayRef<Level> lvls);

/// Emits a for loop to iterate over a tensor level with the provided lower		/// Emits a for loop to iterate over a tensor level with the provided lower
/// bound `lo` and upper bound `hi`.		/// bound `lo` and upper bound `hi`.
/// Apart from iterating just single tensor level, for loops can be used for		/// Apart from iterating just single tensor level, for loops can be used for
/// slice-driven loop on dense level too.		/// slice-driven loop on dense level too.
Operation *emitForLoopOverTensorAtLvl(OpBuilder &builder, Location loc,		Operation *emitForLoopOverTensorAtLvl(OpBuilder &builder, Location loc,
TensorId tid, Level lvl, Value lo,		TensorId tid, Level lvl, Value lo,
Value hi, MutableArrayRef<Value> reduc,		Value hi, MutableArrayRef<Value> reduc,
[287 lines elided]

mlir/lib/Dialect/SparseTensor/Transforms/LoopEmitter.cpp

[469 lines elided]	for (Level lvl = 0; lvl < lvlRank; lvl++) {
}		}
}		}
}		}
}		}
localInsertPos = builder.getInsertionPoint()->getPrevNode();		localInsertPos = builder.getInsertionPoint()->getPrevNode();
}		}

void LoopEmitter::enterNewLoopSeq(OpBuilder &builder, Location loc,		void LoopEmitter::enterNewLoopSeq(OpBuilder &builder, Location loc,
ArrayRef<TensorId> tids,		ArrayRef<TensorLevel> tidLvls) {
ArrayRef<Level> lvls) {
// TODO: sort		// TODO: sort
assert(loopSeqStack.size() == loopStack.size());		assert(loopSeqStack.size() == loopStack.size());
// Prepares for all the tensors used in the current loop sequence.		// Prepares for all the tensors used in the current loop sequence.
std::vector<std::tuple<TensorId, Level, bool>> slicedTids;		std::vector<std::tuple<TensorId, Level, bool>> slicedTids;
for (auto [tid, lvl] : llvm::zip(tids, lvls)) {		for (auto [tid, lvl] : toTidLvlPairRange(tidLvls)) {
if (!dependentLvlMap[tid][lvl].empty()) {		if (!dependentLvlMap[tid][lvl].empty()) {
bool fullyRed = genSliceBegin(builder, loc, tid, lvl);		bool fullyRed = genSliceBegin(builder, loc, tid, lvl);
slicedTids.emplace_back(tid, lvl, fullyRed);		slicedTids.emplace_back(tid, lvl, fullyRed);
} else {		} else {
prepareLoopOverTensorAtLvl(builder, loc, tid, lvl);		prepareLoopOverTensorAtLvl(builder, loc, tid, lvl);
}		}
}		}

[181 lines elided]	Operation *loop =
coords[tid][lvl] = insertPoint->getResult(0);		coords[tid][lvl] = insertPoint->getResult(0);
})		})
.first;		.first;
// Sets the insertionn pointer inside loop body.		// Sets the insertionn pointer inside loop body.
builder.setInsertionPointAfter(insertPoint);		builder.setInsertionPointAfter(insertPoint);
return loop;		return loop;
}		}

Operation *LoopEmitter::enterLoopOverTensorAtLvl(		Operation *LoopEmitter::enterLoopOverTensorAtLvl(OpBuilder &builder,
OpBuilder &builder, Location loc, ArrayRef<TensorId> tids,		Location loc,
ArrayRef<Level> lvls, MutableArrayRef<Value> reduc, bool isParallel) {		ArrayRef<TensorLevel> tidLvls,
		MutableArrayRef<Value> reduc,
		bool isParallel) {
// TODO: support multiple return on parallel for?		// TODO: support multiple return on parallel for?
assert(!isParallel || reduc.size() <= 1);		assert(!isParallel || reduc.size() <= 1);
bool isSparseCond = false, isSparseSliceCond = false;		bool isSparseCond = false, isSparseSliceCond = false;
size_t tid = tids.front(), lvl = lvls.front();		auto [tid, lvl] = toTidLvlPair(tidLvls.front());

// Finds out the tensor level that we should use to generate loops. Amongs all		// Finds out the tensor level that we should use to generate loops. Amongs all
// the tensor levels, there is at most one sparse tensor level.		// the tensor levels, there is at most one sparse tensor level.
for (auto [t, l] : llvm::zip(tids, lvls)) {		for (auto [t, l] : toTidLvlPairRange(tidLvls)) {
assert(lvlTypes[t].size() > l); // Must be a valid tid, dim pair		assert(lvlTypes[t].size() > l); // Must be a valid tid, dim pair
assert(!coords[t][l] || // We cannot re-enter the same level		assert(!coords[t][l] || // We cannot re-enter the same level
!dependentLvlMap[t][l].empty()); // unless it is a slice-driver loop		!dependentLvlMap[t][l].empty()); // unless it is a slice-driver loop
auto lvlType = lvlTypes[t][l];		auto lvlType = lvlTypes[t][l];
// Must be a recognizable DLT.		// Must be a recognizable DLT.
assert(isDenseDLT(lvlType) || isCompressedDLT(lvlType) ||		assert(isDenseDLT(lvlType) || isCompressedDLT(lvlType) ||
isCompressedWithHiDLT(lvlType) \|\| isSingletonDLT(lvlType));		isCompressedWithHiDLT(lvlType) \|\| isSingletonDLT(lvlType));

[24 lines elided]	Operation *LoopEmitter::enterLoopOverTensorAtLvl(OpBuilder &builder,
bool isDenseSliceCond =		bool isDenseSliceCond =
isDenseDLT(lvlType) && !dependentLvlMap[tid][lvl].empty();		isDenseDLT(lvlType) && !dependentLvlMap[tid][lvl].empty();
// if the slice is fully reduced, we can now use TACO-based algorithm to		// if the slice is fully reduced, we can now use TACO-based algorithm to
// iterate it.		// iterate it.

Operation *l = nullptr;		Operation *l = nullptr;

// At most one tensor used as condition in for loop;		// At most one tensor used as condition in for loop;
SmallVector<TensorId, 1> condTid;		SmallVector<TensorLevel, 1> condTidLvl;
SmallVector<Level, 1> condLvl;
// There Might be multiple dense slice driven tensor.		// There Might be multiple dense slice driven tensor.
SmallVector<TensorId> sliceTids;		SmallVector<SliceLoopTuple> sliceDrivenTuples;
SmallVector<Level> sliceLvls;
SmallVector<bool> sliceReduc;

// Generates loops differently depending on whether we need a slice-driven		// Generates loops differently depending on whether we need a slice-driven
// loop or a simple level traversal loop.		// loop or a simple level traversal loop.
if (isSparseSliceCond) {		if (isSparseSliceCond) {
bool fullyReduced = depFullyReduced(tid, lvl);		bool fullyReduced = depFullyReduced(tid, lvl);
if (!fullyReduced) {		if (!fullyReduced) {
l = emitSliceDrivenLoopOverTensorAtLvl(builder, loc, tid, lvl, reduc);		l = emitSliceDrivenLoopOverTensorAtLvl(builder, loc, tid, lvl, reduc);
} else {		} else {
// If the slice is fully reduced, we can now use TACO-based algorithm to		// If the slice is fully reduced, we can now use TACO-based algorithm to
// iterate it.		// iterate it.
l = emitWhileLoopOverSliceAtSparseLvl(		l = emitWhileLoopOverSliceAtSparseLvl(
builder, loc, posits[tid][lvl], highs[tid][lvl],		builder, loc, posits[tid][lvl], highs[tid][lvl],
getFinalSliceOnLvl(tid, lvl).offset, sliceSizes[tid][lvl].back(), tid,		getFinalSliceOnLvl(tid, lvl).offset, sliceSizes[tid][lvl].back(), tid,
lvl, reduc);		lvl, reduc);
}		}
levelReducedDep[tid][lvl]++;		levelReducedDep[tid][lvl]++;
sliceTids.push_back(tid);		sliceDrivenTuples.emplace_back(tid, lvl, fullyReduced);
sliceLvls.push_back(lvl);
sliceReduc.push_back(fullyReduced);
} else {		} else {
Value lo = isSparseCond ? posits[tid][lvl] // current offset		Value lo = isSparseCond ? posits[tid][lvl] // current offset
: loopSeqStack.back().first; // universal index		: loopSeqStack.back().first; // universal index
Value hi = highs[tid][lvl];		Value hi = highs[tid][lvl];
if (isDenseSliceCond) {		if (isDenseSliceCond) {
bool fullyReduced = depFullyReduced(tid, lvl);		bool fullyReduced = depFullyReduced(tid, lvl);
Value sliceSz = sliceSizes[tid][lvl][sliceStack[tid].back().depth - 1];		Value sliceSz = sliceSizes[tid][lvl][sliceStack[tid].back().depth - 1];
// Adjust for loop hi for dense slice-driven loop.		// Adjust for loop hi for dense slice-driven loop.
if (fullyReduced) {		if (fullyReduced) {
hi = sliceSz;		hi = sliceSz;
condTid.push_back(tid);		condTidLvl.emplace_back(makeTensorLevel(tid, lvl));
		wrengr: It's more appropriate to use `push_back` here, since the return type of `makeTensorLevel` is exactly what's being stored (i.e., it's not being passed to some constructor for what's actually stored)
condLvl.push_back(lvl);
} else {		} else {
hi = SUBI(lvlSizes[tid][lvl], sliceSz);		hi = SUBI(lvlSizes[tid][lvl], sliceSz);
hi = ADDI(hi, C_IDX(1));		hi = ADDI(hi, C_IDX(1));
}		}
} else {		} else {
condTid.push_back(tid);		condTidLvl.emplace_back(makeTensorLevel(tid, lvl));
		wrengr: ditto, should be `push_back`
condLvl.push_back(lvl);
}		}
l = emitForLoopOverTensorAtLvl(builder, loc, tid, lvl, lo, hi, reduc,		l = emitForLoopOverTensorAtLvl(builder, loc, tid, lvl, lo, hi, reduc,
isParallel);		isParallel);
}		}
Value iv = coords[tid][lvl];		Value iv = coords[tid][lvl];
for (auto [t, l] : llvm::zip(tids, lvls)) {		for (auto [t, l] : toTidLvlPairRange(tidLvls)) {
// We only need to handle slice-driven loops on dense level here.		// We only need to handle slice-driven loops on dense level here.
// If it is a slice-driven loop on sparse level, it needs a while loop to		// If it is a slice-driven loop on sparse level, it needs a while loop to
// insert break statements, and it must have been handled correctly in L692.		// insert break statements, and it must have been handled correctly in L692.
if (!dependentLvlMap[t][l].empty() && isDenseDLT(lvlTypes[t][l])) {		if (!dependentLvlMap[t][l].empty() && isDenseDLT(lvlTypes[t][l])) {
// Pushes sliced levels to build correct LoopInfo.		// Pushes sliced levels to build correct LoopInfo.
bool fullyReduc = depFullyReduced(t, l);		bool fullyReduc = depFullyReduced(t, l);
SliceInfo &info = sliceStack[t].back();		SliceInfo &info = sliceStack[t].back();
if (fullyReduc) {		if (fullyReduc) {
posits[t][l] = genAddress(builder, loc, t, l, ADDI(info.offset, iv));		posits[t][l] = genAddress(builder, loc, t, l, ADDI(info.offset, iv));
} else {		} else {
// Puts sliced dense loop into LoopInfo so that LoopEmitter knows how to		// Puts sliced dense loop into LoopInfo so that LoopEmitter knows how to
// exit it.		// exit it.
sliceTids.push_back(t);		sliceDrivenTuples.emplace_back(t, l, fullyReduc);
sliceLvls.push_back(l);
sliceReduc.push_back(fullyReduc);
// Update the slice information as we enter the new loop.		// Update the slice information as we enter the new loop.
assert(*info.slicedOnLvl == l);		assert(*info.slicedOnLvl == l);
info.minCrd = info.offset = iv;		info.minCrd = info.offset = iv;
info.isNonEmpty = constantI1(builder, loc, true);		info.isNonEmpty = constantI1(builder, loc, true);
levelReducedDep[t][l]++;		levelReducedDep[t][l]++;
}		}
}		}
}		}
// NOTE: we can also prepare for next dim here in advance		// NOTE: we can also prepare for next dim here in advance
// Pushes the loop into stack.		// Pushes the loop into stack.
loopStack.emplace_back(condTid, condLvl, sliceTids, sliceLvls, sliceReduc, l,		loopStack.emplace_back(condTidLvl, sliceDrivenTuples, l,
builder.getInsertionBlock(), iv, loopTag);		builder.getInsertionBlock(), iv, loopTag);
// Emit extra locals.		// Emit extra locals.
emitExtraLocalsForTensorsAtDenseLvls(builder, loc, tids, lvls);		emitExtraLocalsForTensorsAtDenseLvls(builder, loc, tidLvls);
return l;		return l;
}		}
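The `push_back` versus `emplace_back` comments in `enterLoopOverTensorAtLvl` can be illustrated with a small standalone sketch. It uses `std::vector` as a stand-in for `llvm::SmallVector` so that it compiles without LLVM headers, and `packTensorLevel` is a hypothetical helper standing in for `makeTensorLevel` (its encoding is an assumption, not taken from the patch). Both calls compile in both places; the sketch only shows where each one communicates intent better.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using TensorLevel = unsigned;

// Hypothetical stand-in for makeTensorLevel; the encoding is assumed.
inline TensorLevel packTensorLevel(unsigned tid, unsigned lvl,
                                   unsigned numTensors) {
  return lvl * numTensors + tid;
}

// The helper already returns the element type, so there are no constructor
// arguments to forward: push_back states the intent directly.
std::vector<TensorLevel> collectConds(unsigned numTensors) {
  std::vector<TensorLevel> conds;
  conds.push_back(packTensorLevel(1, 2, numTensors));
  return conds;
}

// emplace_back earns its keep when the arguments are forwarded to the
// element's constructor, as with the (tid, lvl, fullyReduced) tuples.
std::vector<std::pair<unsigned, std::string>> collectTagged() {
  std::vector<std::pair<unsigned, std::string>> tagged;
  tagged.emplace_back(7u, "loop"); // constructs the pair in place
  return tagged;
}
```

This matches the split in the patch: `sliceDrivenTuples.emplace_back(tid, lvl, fullyReduced)` builds a tuple from its parts, while `condTidLvl` receives a value that is already a `TensorLevel`.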

Operation *LoopEmitter::enterFilterLoopOverTensorAtLvl(		Operation *LoopEmitter::enterFilterLoopOverTensorAtLvl(
OpBuilder &builder, Location loc, TensorId tid, Level lvl,		OpBuilder &builder, Location loc, TensorId tid, Level lvl,
AffineExpr affine, MutableArrayRef<Value> reduc) {		AffineExpr affine, MutableArrayRef<Value> reduc) {
assert(isValidLevel(tid, lvl));		assert(isValidLevel(tid, lvl));
assert(!affine.isa<AffineDimExpr>() && !isDenseDLT(lvlTypes[tid][lvl]));		assert(!affine.isa<AffineDimExpr>() && !isDenseDLT(lvlTypes[tid][lvl]));
[47 lines elided]	if (hasReduc) {
// On mismatch.		// On mismatch.
YIELD(reduc);		YIELD(reduc);
}		}
// Set the insert point to matched branch.		// Set the insert point to matched branch.
builder.setInsertionPointToStart(&ifOp.getThenRegion().front());		builder.setInsertionPointToStart(&ifOp.getThenRegion().front());

// NOTE: we can also prepare for next lvl here in advance		// NOTE: we can also prepare for next lvl here in advance
// Push the loop into stack		// Push the loop into stack
loopStack.emplace_back(ArrayRef<TensorId>(tid), ArrayRef<Level>(lvl),		loopStack.emplace_back(ArrayRef<TensorLevel>(makeTensorLevel(tid, lvl)),
ArrayRef<TensorId>(), ArrayRef<Level>(),		ArrayRef<SliceLoopTuple>(), forOp,
ArrayRef<bool>(), forOp, builder.getInsertionBlock(),		builder.getInsertionBlock(), coords[tid][lvl],
coords[tid][lvl], nullptr);		nullptr);
return forOp;		return forOp;
}		}

void LoopEmitter::genDenseAffineAddress(OpBuilder &builder, Location loc,		void LoopEmitter::genDenseAffineAddress(OpBuilder &builder, Location loc,
TensorId tid, Level lvl,		TensorLevel tidLvl,
AffineExpr lvlExpr) {		AffineExpr lvlExpr) {
		auto [tid, lvl] = toTidLvlPair(tidLvl);
assert(isDenseDLT(lvlTypes[tid][lvl]));		assert(isDenseDLT(lvlTypes[tid][lvl]));
// For dense levels, the level-coordinate also serves as the position.		// For dense levels, the level-coordinate also serves as the position.
Value lvlCrd = genAffine(builder, loc, lvlExpr);		Value lvlCrd = genAffine(builder, loc, lvlExpr);
posits[tid][lvl] = genAddress(builder, loc, tid, lvl, lvlCrd);		posits[tid][lvl] = genAddress(builder, loc, tid, lvl, lvlCrd);
}		}
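The visible hunks call `makeTensorLevel` and `toTidLvlPair` but do not show their definitions. One plausible encoding, assumed here purely for illustration, packs a (tensor, level) pair into a single `unsigned` in the same spirit as the "compress things like we do for `TensorLoopId`" suggestion. The names `packTensorLevel` and `unpackTensorLevel`, and the explicit `numTensors` parameter, are hypothetical; the real helpers may use a different layout.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-ins for the MLIR typedefs (assumption: plain unsigned).
using TensorId = unsigned;
using Level = unsigned;
using TensorLevel = unsigned;

// Assumed encoding: lvl * numTensors + tid. Requires tid < numTensors.
inline TensorLevel packTensorLevel(TensorId tid, Level lvl,
                                   unsigned numTensors) {
  assert(tid < numTensors && "tensor id out of range");
  return lvl * numTensors + tid;
}

// Inverse of packTensorLevel under the same assumed encoding.
inline std::pair<TensorId, Level> unpackTensorLevel(TensorLevel tl,
                                                    unsigned numTensors) {
  assert(numTensors > 0 && "need at least one tensor");
  return {tl % numTensors, tl / numTensors};
}
```

Under this assumption a call such as `auto [tid, lvl] = unpackTensorLevel(tl, n);` recovers the pair, which is the shape of the `auto [tid, lvl] = toTidLvlPair(tidLvl);` line in `genDenseAffineAddress`.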

Operation *LoopEmitter::enterCoIterationOverTensorsAtLvls(		Operation *LoopEmitter::enterCoIterationOverTensorsAtLvls(
OpBuilder &builder, Location loc, ArrayRef<TensorId> tids,		OpBuilder &builder, Location loc, ArrayRef<TensorLevel> tidLvls,
ArrayRef<Level> lvls, bool needsUniv, MutableArrayRef<Value> reduc) {		bool needsUniv, MutableArrayRef<Value> reduc) {
// NOTE: the slice driven tensor-related reduction variable must		// NOTE: the slice driven tensor-related reduction variable must
// appear before normal tensors.		// appear before normal tensors.
assert(tids.size() == lvls.size());
SmallVector<Type> types;		SmallVector<Type> types;
SmallVector<Value> operands;		SmallVector<Value> operands;
// Construct the while-loop with a parameter for each coordinate.		// Construct the while-loop with a parameter for each coordinate.
const Type indexType = builder.getIndexType();		const Type indexType = builder.getIndexType();
for (auto [tid, lvl] : llvm::zip(tids, lvls)) {		for (auto [tid, lvl] : toTidLvlPairRange(tidLvls)) {
// TODO: support coiteration with slice driven tensors.		// TODO: support coiteration with slice driven tensors.
const auto lvlTp = lvlTypes[tid][lvl];		const auto lvlTp = lvlTypes[tid][lvl];
assert(dependentLvlMap[tid][lvl].empty() && "TODO: not yet implemented");		assert(dependentLvlMap[tid][lvl].empty() && "TODO: not yet implemented");
if (isCompressedDLT(lvlTp) || isSingletonDLT(lvlTp) ||		if (isCompressedDLT(lvlTp) || isSingletonDLT(lvlTp) ||
isCompressedWithHiDLT(lvlTp)) {		isCompressedWithHiDLT(lvlTp)) {
const auto reassoc = getCollapseReassociation(tid, lvl);		const auto reassoc = getCollapseReassociation(tid, lvl);
for (unsigned i = 0, e = reassoc.size() - 1; i < e; i++) {		for (unsigned i = 0, e = reassoc.size() - 1; i < e; i++) {
if (!isUniqueDLT(lvlTypes[tid][reassoc[i]])) {		if (!isUniqueDLT(lvlTypes[tid][reassoc[i]])) {
[25 lines elided]	Operation *LoopEmitter::enterCoIterationOverTensorsAtLvls(
Block *before = builder.createBlock(&whileOp.getBefore(), {}, types, locs);		Block *before = builder.createBlock(&whileOp.getBefore(), {}, types, locs);
Block *after = builder.createBlock(&whileOp.getAfter(), {}, types, locs);		Block *after = builder.createBlock(&whileOp.getAfter(), {}, types, locs);

// Build the "before" region, which effectively consists		// Build the "before" region, which effectively consists
// of a conjunction of "i < upper" tests on all induction.		// of a conjunction of "i < upper" tests on all induction.
builder.setInsertionPointToStart(&whileOp.getBefore().front());		builder.setInsertionPointToStart(&whileOp.getBefore().front());
Value cond;		Value cond;
unsigned o = 0;		unsigned o = 0;
for (auto [t, lvl] : llvm::zip(tids, lvls)) {		for (auto [t, lvl] : toTidLvlPairRange(tidLvls)) {
const TensorId tid = t; // Why `t` can not be captured by lambda?		const TensorId tid = t; // Why `t` can not be captured by lambda?
const auto lvlTp = lvlTypes[tid][lvl];		const auto lvlTp = lvlTypes[tid][lvl];
if (isCompressedDLT(lvlTp) || isSingletonDLT(lvlTp) ||		if (isCompressedDLT(lvlTp) || isSingletonDLT(lvlTp) ||
isCompressedWithHiDLT(lvlTp)) {		isCompressedWithHiDLT(lvlTp)) {
const auto reassoc = getCollapseReassociation(tid, lvl);		const auto reassoc = getCollapseReassociation(tid, lvl);
assert(reassoc.size() == 1 || isUniqueCOOType(tensors[tid].getType()));		assert(reassoc.size() == 1 || isUniqueCOOType(tensors[tid].getType()));
for (unsigned i = 0, e = reassoc.size() - 1; i < e; i++) {		for (unsigned i = 0, e = reassoc.size() - 1; i < e; i++) {
if (!isUniqueDLT(lvlTypes[tid][reassoc[i]])) {		if (!isUniqueDLT(lvlTypes[tid][reassoc[i]])) {
[17 lines elided]	Operation *LoopEmitter::enterCoIterationOverTensorsAtLvls(
}		}
builder.create<scf::ConditionOp>(loc, cond, before->getArguments());		builder.create<scf::ConditionOp>(loc, cond, before->getArguments());

// Generates while body.		// Generates while body.
builder.setInsertionPointToStart(&whileOp.getAfter().front());		builder.setInsertionPointToStart(&whileOp.getAfter().front());

SmallVector<std::pair<Value, unsigned>> slicesPreds;		SmallVector<std::pair<Value, unsigned>> slicesPreds;
unsigned i = 0;		unsigned i = 0;
for (auto [tid, lvl] : llvm::zip(tids, lvls)) {		for (auto [tid, lvl] : toTidLvlPairRange(tidLvls)) {
// Prepares for next level.		// Prepares for next level.
const auto lvlTp = lvlTypes[tid][lvl];		const auto lvlTp = lvlTypes[tid][lvl];
if (isCompressedDLT(lvlTp) || isSingletonDLT(lvlTp) ||		if (isCompressedDLT(lvlTp) || isSingletonDLT(lvlTp) ||
isCompressedWithHiDLT(lvlTp)) {		isCompressedWithHiDLT(lvlTp)) {
coords[tid][lvl] = genSparseCrd(builder, loc, tid, lvl);		coords[tid][lvl] = genSparseCrd(builder, loc, tid, lvl);
if (isSparseSlices[tid]) {		if (isSparseSlices[tid]) {
auto [trans, pred] =		auto [trans, pred] =
genSliceLegitPredicate(builder, loc, coords[tid][lvl], tid, lvl);		genSliceLegitPredicate(builder, loc, coords[tid][lvl], tid, lvl);
[34 lines elided]	if (!slicesPreds.empty()) {

// If all slices are legit, start the user generated code.		// If all slices are legit, start the user generated code.
builder.setInsertionPointToStart(&ifOp.getThenRegion().front());		builder.setInsertionPointToStart(&ifOp.getThenRegion().front());
}		}

Value min;		Value min;
// Finds the minimum coordinate		// Finds the minimum coordinate
if (!needsUniv) {		if (!needsUniv) {
for (auto [tid, lvl] : llvm::zip(tids, lvls)) {		for (auto [tid, lvl] : toTidLvlPairRange(tidLvls)) {
const auto lvlTp = lvlTypes[tid][lvl];		const auto lvlTp = lvlTypes[tid][lvl];
if (isCompressedDLT(lvlTp) || isSingletonDLT(lvlTp) ||		if (isCompressedDLT(lvlTp) || isSingletonDLT(lvlTp) ||
isCompressedWithHiDLT(lvlTp)) {		isCompressedWithHiDLT(lvlTp)) {
const auto crd = coords[tid][lvl];		const auto crd = coords[tid][lvl];
if (min) {		if (min) {
Value cmp = CMPI(ult, coords[tid][lvl], min);		Value cmp = CMPI(ult, coords[tid][lvl], min);
min = SELECT(cmp, coords[tid][lvl], min);		min = SELECT(cmp, coords[tid][lvl], min);
} else {		} else {
min = crd;		min = crd;
}		}
}		}
}		}
} else {		} else {
assert(!min);		assert(!min);
// Otherwise, universal index is the minimal pos.		// Otherwise, universal index is the minimal pos.
min = after->getArguments().back();		min = after->getArguments().back();
}		}

// Sets up the loop stack.		// Sets up the loop stack.
loopStack.emplace_back(tids, lvls, ArrayRef<TensorId>(), ArrayRef<Level>(),		loopStack.emplace_back(tidLvls, ArrayRef<SliceLoopTuple>(), whileOp,
ArrayRef<bool>(), whileOp, builder.getInsertionBlock(),		builder.getInsertionBlock(), min, loopTag);
min, loopTag);
assert(loopStack.size() == loopSeqStack.size());		assert(loopStack.size() == loopSeqStack.size());

for (auto [tid, dstLvl] : llvm::zip(tids, lvls)) {		for (auto [tid, dstLvl] : toTidLvlPairRange(tidLvls)) {
const auto reassoc = getCollapseReassociation(tid, dstLvl);		const auto reassoc = getCollapseReassociation(tid, dstLvl);
assert(reassoc.size() == 1 || isUniqueCOOType(tensors[tid].getType()));		assert(reassoc.size() == 1 || isUniqueCOOType(tensors[tid].getType()));
// TODO: Refactors this into smaller functions.		// TODO: Refactors this into smaller functions.
// NOTE: For all the collapsed level (except for the last one, that is why		// NOTE: For all the collapsed level (except for the last one, that is why
// the loop ends with `reassoc.size() - 1`), as each iteration is advanced		// the loop ends with `reassoc.size() - 1`), as each iteration is advanced
// by the segment size of the last level, which does not always invalidate		// by the segment size of the last level, which does not always invalidate
// the segment size for the previous levels, thus we need to propagate the		// the segment size for the previous levels, thus we need to propagate the
// segment sizes across loop iterations and only forward if needed.		// segment sizes across loop iterations and only forward if needed.
[30 lines elided]	for (auto [tid, dstLvl] : toTidLvlPairRange(tidLvls)) {
const auto srcLvl = reassoc.back();		const auto srcLvl = reassoc.back();
if (!isUniqueDLT(lvlTypes[tid][srcLvl])) {		if (!isUniqueDLT(lvlTypes[tid][srcLvl])) {
segHi[tid][srcLvl] = genSegmentHigh(		segHi[tid][srcLvl] = genSegmentHigh(
builder, loc, tid, srcLvl, posits[tid][srcLvl], highs[tid][srcLvl]);		builder, loc, tid, srcLvl, posits[tid][srcLvl], highs[tid][srcLvl]);
}		}
}		}

// Emits extra locals		// Emits extra locals
emitExtraLocalsForTensorsAtDenseLvls(builder, loc, tids, lvls);		emitExtraLocalsForTensorsAtDenseLvls(builder, loc, tidLvls);

// Updates reduction variables		// Updates reduction variables
assert(after->getNumArguments() == o + reduc.size() + (needsUniv ? 1 : 0));		assert(after->getNumArguments() == o + reduc.size() + (needsUniv ? 1 : 0));
// In-place update on reduction variable.		// In-place update on reduction variable.
for (unsigned i = 0, e = reduc.size(); i < e; i++)		for (unsigned i = 0, e = reduc.size(); i < e; i++)
reduc[i] = after->getArgument(o + i);		reduc[i] = after->getArgument(o + i);

return whileOp;		return whileOp;
[44 lines elided]	if (isSingletonDLT(lvlTp)) {
: ADDI(pLo, c1);		: ADDI(pLo, c1);
return;		return;
}		}
}		}

llvm_unreachable("Unrecognized level-type!");		llvm_unreachable("Unrecognized level-type!");
}		}

void LoopEmitter::emitExtraLocalsForTensorsAtDenseLvls(OpBuilder &builder,		void LoopEmitter::emitExtraLocalsForTensorsAtDenseLvls(
Location loc,		OpBuilder &builder, Location loc, ArrayRef<TensorLevel> tidLvls) {
ArrayRef<TensorId> tids,
ArrayRef<Level> lvls) {
// Initialize dense positions. Note that we generate dense coordinates of the		// Initialize dense positions. Note that we generate dense coordinates of the
// output tensor unconditionally, since they may not appear in the lattice,		// output tensor unconditionally, since they may not appear in the lattice,
// but may be needed for linearized codegen.		// but may be needed for linearized codegen.
assert(tids.size() == lvls.size());		for (auto [tid, lvl] : toTidLvlPairRange(tidLvls)) {
for (auto [tid, lvl] : llvm::zip(tids, lvls)) {
if (isDenseDLT(lvlTypes[tid][lvl])) {		if (isDenseDLT(lvlTypes[tid][lvl])) {
// Slice-driven dense level should have be handled already.		// Slice-driven dense level should have be handled already.
if (!dependentLvlMap[tid][lvl].empty())		if (!dependentLvlMap[tid][lvl].empty())
continue;		continue;

auto enc = getSparseTensorEncoding(tensors[tid].getType());		auto enc = getSparseTensorEncoding(tensors[tid].getType());
if (enc && !isSparseOutput(tid)) {		if (enc && !isSparseOutput(tid)) {
bool validPos = lvl == 0 || posits[tid][lvl - 1];		bool validPos = lvl == 0 || posits[tid][lvl - 1];
[10 lines elided]	for (auto [tid, lvl] : toTidLvlPairRange(tidLvls)) {
}		}
}		}
}		}

void LoopEmitter::exitForLoop(RewriterBase &rewriter, Location loc,		void LoopEmitter::exitForLoop(RewriterBase &rewriter, Location loc,
MutableArrayRef<Value> reduc) {		MutableArrayRef<Value> reduc) {
const LoopInfo &loopInfo = loopStack.back();		const LoopInfo &loopInfo = loopStack.back();
rewriter.setInsertionPointToEnd(loopInfo.userCodeBlock);		rewriter.setInsertionPointToEnd(loopInfo.userCodeBlock);
for (auto [tid, lvl, reduced] : llvm::zip(		for (auto [tid, lvl, reduced] : loopInfo.sliceDrivenInfo) {
loopInfo.slicedTids, loopInfo.slicedLvls, loopInfo.sliceReduced)) {
SliceInfo &info = sliceStack[tid].back();		SliceInfo &info = sliceStack[tid].back();
assert(isDenseDLT(lvlTypes[tid][lvl]));		assert(isDenseDLT(lvlTypes[tid][lvl]));
assert(*info.slicedOnLvl == lvl && !reduced);		assert(*info.slicedOnLvl == lvl && !reduced);
(void)reduced;		(void)reduced;
// Resets slices pointers as the resolved slices are invalidated after we		// Resets slices pointers as the resolved slices are invalidated after we
// moves forward to the next slice.		// moves forward to the next slice.
invalidateSliceIterIdx(rewriter, loc, tid, lvl);		invalidateSliceIterIdx(rewriter, loc, tid, lvl);
info.minCrd = info.offset = info.isNonEmpty = Value();		info.minCrd = info.offset = info.isNonEmpty = Value();
[60 lines elided]	#endif // NDEBUG
// In-place update reduction variables.		// In-place update reduction variables.
for (unsigned i = 0, e = parOp.getResults().size(); i < e; i++)		for (unsigned i = 0, e = parOp.getResults().size(); i < e; i++)
reduc[i] = parOp.getResult(i);		reduc[i] = parOp.getResult(i);
}		}

// Finished iterating a tensor, clean up		// Finished iterating a tensor, clean up
// We only do the clean up on for loop as while loops do not necessarily		// We only do the clean up on for loop as while loops do not necessarily
// finish the iteration on a sparse tensor		// finish the iteration on a sparse tensor
for (auto [tid, lvl] : llvm::zip(loopInfo.tids, loopInfo.lvls)) {		for (auto [tid, lvl] : toTidLvlPairRange(loopInfo.tidLvls)) {
// Reset to null.		// Reset to null.
coords[tid][lvl] = Value();		coords[tid][lvl] = Value();
posits[tid][lvl] = Value();		posits[tid][lvl] = Value();
// Dense level, high is fixed.		// Dense level, high is fixed.
if (!isDenseDLT(lvlTypes[tid][lvl]))		if (!isDenseDLT(lvlTypes[tid][lvl]))
highs[tid][lvl] = Value();		highs[tid][lvl] = Value();
}		}
}		}
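
The `for (auto [tid, lvl, reduced] : loopInfo.sliceDrivenInfo)` loops in this diff read the same whether the element type is a `std::tuple` or a plain struct with public data members, which is the point of the review comment on LoopEmitter.h line 239. A minimal standalone sketch follows. It assumes a hypothetical `LoopSliceInfo` whose field names are chosen here for illustration, and uses `std::vector` and local typedefs as stand-ins for the MLIR types; the definition in the patch may differ.

```cpp
#include <cassert>
#include <vector>

// Stand-ins for the MLIR typedefs (assumption: both are plain unsigned).
using TensorId = unsigned;
using Level = unsigned;

// Hypothetical struct; the field names are illustrative, not the patch's.
struct LoopSliceInfo final {
  TensorId tid;
  Level lvl;
  bool reduced;
};

// Counts the entries whose slice is not fully reduced. The structured
// binding works directly on the struct's data members.
inline unsigned countUnreduced(const std::vector<LoopSliceInfo> &infos) {
  unsigned n = 0;
  for (auto [tid, lvl, reduced] : infos) {
    (void)tid;
    (void)lvl;
    if (!reduced)
      ++n;
  }
  return n;
}
```

Client code that does not want a structured binding can still write `info.tid` or `info.lvl`, which is what removes the need for wrappers around `std::get<N>`.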

void LoopEmitter::exitWhileLoop(OpBuilder &builder, Location loc,		void LoopEmitter::exitWhileLoop(OpBuilder &builder, Location loc,
MutableArrayRef<Value> reduc) {		MutableArrayRef<Value> reduc) {
const LoopInfo &loopInfo = loopStack.back();		const LoopInfo &loopInfo = loopStack.back();
auto whileOp = llvm::cast<scf::WhileOp>(loopInfo.loop);		auto whileOp = llvm::cast<scf::WhileOp>(loopInfo.loop);
builder.setInsertionPointToEnd(loopInfo.userCodeBlock);		builder.setInsertionPointToEnd(loopInfo.userCodeBlock);
Value iv = loopInfo.iv;		Value iv = loopInfo.iv;

// Finalize the induction. Note that the induction could be performed		// Finalize the induction. Note that the induction could be performed
// in the individual if-branches to avoid re-evaluating the conditions.		// in the individual if-branches to avoid re-evaluating the conditions.
// However, that would result in a rather elaborate forest of yield		// However, that would result in a rather elaborate forest of yield
// instructions during code generation. Moreover, performing the induction		// instructions during code generation. Moreover, performing the induction
// after the if-statements more closely resembles code generated by TACO.		// after the if-statements more closely resembles code generated by TACO.
unsigned o = 0;		unsigned o = 0;
SmallVector<Value> operands;		SmallVector<Value> operands;
unsigned delta = 0;		unsigned delta = 0;
for (auto [tid, lvl, resolved] : llvm::zip(		for (auto [tid, lvl, resolved] : loopInfo.sliceDrivenInfo) {
loopInfo.slicedTids, loopInfo.slicedLvls, loopInfo.sliceReduced)) {
// TODO: handle dense.		// TODO: handle dense.
assert(isCompressedDLT(lvlTypes[tid][lvl]));		assert(isCompressedDLT(lvlTypes[tid][lvl]));
levelReducedDep[tid][lvl]--;		levelReducedDep[tid][lvl]--;
if (!resolved) {		if (!resolved) {
genSliceNextInduction(builder, loc, whileOp, tid, lvl, operands, o);		genSliceNextInduction(builder, loc, whileOp, tid, lvl, operands, o);
continue;		continue;
}		}
// TODO: We need to distinguish coiterate loop with slice-driven loop and		// TODO: We need to distinguish coiterate loop with slice-driven loop and
// fully reduced while op for iterating one slices.		// fully reduced while op for iterating one slices.
// FIXME: since we didn't implement coiteration, this must be iteration		// FIXME: since we didn't implement coiteration, this must be iteration
// just on fully resolved slice.		// just on fully resolved slice.
assert(loopInfo.slicedTids.size() == 1 && loopInfo.tids.empty());		assert(loopInfo.sliceDrivenInfo.size() == 1 && loopInfo.tidLvls.empty());
// The if guard to filter out out-range coordinates.		// The if guard to filter out out-range coordinates.
assert(llvm::isa<scf::IfOp>(builder.getInsertionBlock()->getParentOp()));		assert(llvm::isa<scf::IfOp>(builder.getInsertionBlock()->getParentOp()));
posits[tid][lvl] = whileOp->getResult(o++);		posits[tid][lvl] = whileOp->getResult(o++);
// FIXME: we are not using continue here since we do not support		// FIXME: we are not using continue here since we do not support
// coiteration on slices. But it need to be treated similarly as the		// coiteration on slices. But it need to be treated similarly as the
// universal index.		// universal index.
o++; // skip continue flag.		o++; // skip continue flag.
// Since we did not push two results from whileOp. The size of the		// Since we did not push two results from whileOp. The size of the
// operands vector is smaller than the actual number of return values from		// operands vector is smaller than the actual number of return values from
// the whileOp.		// the whileOp.
// It is because we are actually generating yield in the IfOp inside the		// It is because we are actually generating yield in the IfOp inside the
// whileOp to only iterates over inbound coordinates within the slices.		// whileOp to only iterates over inbound coordinates within the slices.
delta += 2;		delta += 2;
};		};

Value one = C_IDX(1);		Value one = C_IDX(1);
for (auto [tid, dstLvl] : llvm::zip(loopInfo.tids, loopInfo.lvls)) {		for (auto [tid, dstLvl] : toTidLvlPairRange(loopInfo.tidLvls)) {
const auto lvlTp = lvlTypes[tid][dstLvl];		const auto lvlTp = lvlTypes[tid][dstLvl];
if (isCompressedDLT(lvlTp) || isSingletonDLT(lvlTp) ||		if (isCompressedDLT(lvlTp) || isSingletonDLT(lvlTp) ||
isCompressedWithHiDLT(lvlTp)) {		isCompressedWithHiDLT(lvlTp)) {
const auto reassoc = getCollapseReassociation(tid, dstLvl);		const auto reassoc = getCollapseReassociation(tid, dstLvl);
assert(reassoc.size() == 1 || isUniqueCOOType(tensors[tid].getType()));		assert(reassoc.size() == 1 || isUniqueCOOType(tensors[tid].getType()));
for (unsigned i = 0, e = reassoc.size() - 1; i < e; i++) {		for (unsigned i = 0, e = reassoc.size() - 1; i < e; i++) {
const Level srcLvl = reassoc[i];		const Level srcLvl = reassoc[i];
if (!isUniqueDLT(lvlTypes[tid][srcLvl])) {		if (!isUniqueDLT(lvlTypes[tid][srcLvl])) {
[...]	void LoopEmitter::exitWhileLoop(OpBuilder &builder, Location loc,
builder.setInsertionPointAfter(whileOp);		builder.setInsertionPointAfter(whileOp);
}		}

void LoopEmitter::exitCurrentLoop(RewriterBase &rewriter, Location loc,		void LoopEmitter::exitCurrentLoop(RewriterBase &rewriter, Location loc,
MutableArrayRef<Value> reduc) {		MutableArrayRef<Value> reduc) {
// Clean up the values, it would help use to discover potential bug at a		// Clean up the values, it would help use to discover potential bug at a
// earlier stage (instead of silently using a wrong value).		// earlier stage (instead of silently using a wrong value).
const LoopInfo &loopInfo = loopStack.back();		const LoopInfo &loopInfo = loopStack.back();
assert(loopInfo.tids.size() == loopInfo.lvls.size());
SmallVector<Value> red;		SmallVector<Value> red;
if (llvm::isa<scf::WhileOp>(loopInfo.loop)) {		if (llvm::isa<scf::WhileOp>(loopInfo.loop)) {
exitWhileLoop(rewriter, loc, reduc);		exitWhileLoop(rewriter, loc, reduc);
} else {		} else {
exitForLoop(rewriter, loc, reduc);		exitForLoop(rewriter, loc, reduc);
}		}

assert(loopStack.size() == loopSeqStack.size());		assert(loopStack.size() == loopSeqStack.size());
[...]

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorRewriting.cpp

[...]	LogicalResult matchAndRewrite(ForeachOp op,
// 1. Generates loop for the sparse input.		// 1. Generates loop for the sparse input.
LoopEmitter loopEmitter(		LoopEmitter loopEmitter(
ValueRange{input},		ValueRange{input},
StringAttr::get(getContext(), ForeachOp::getOperationName()));		StringAttr::get(getContext(), ForeachOp::getOperationName()));
loopEmitter.initializeLoopEmit(rewriter, loc);		loopEmitter.initializeLoopEmit(rewriter, loc);
for (Level l = 0; l < lvlRank; l++) {		for (Level l = 0; l < lvlRank; l++) {
// TODO: provide utility function for loop sequences that only contains		// TODO: provide utility function for loop sequences that only contains
// one for loop?		// one for loop?
// FIXME(wrengr): what is this "ld" supposed to be really?		const SmallVector<TensorLevel, 1> tidLvls{
const Level ld = op.getOrder() ? op.getOrder()->getDimPosition(l) : l;		loopEmitter.makeTensorLevel(0, l)};
const SmallVector<TensorId, 1> tids{0};		loopEmitter.enterNewLoopSeq(rewriter, loc, tidLvls);
		wrengr: Why keep this defn of `ld` around but commented out? If it's no longer needed, then should remove it (along with my fixme comment about trying to figure out what it's actually supposed to be)
loopEmitter.enterNewLoopSeq(rewriter, loc, tids, ld);
// Note that reduc will be taken care of by loop emitter and get updated		// Note that reduc will be taken care of by loop emitter and get updated
// in place.		// in place.
		loopEmitter.enterLoopOverTensorAtLvl(rewriter, loc, tidLvls, reduc);
loopEmitter.enterLoopOverTensorAtLvl(rewriter, loc, tids, l, reduc);
}		}

SmallVector<Value> lcvs;		SmallVector<Value> lcvs;
lcvs.reserve(lvlRank);		lcvs.reserve(lvlRank);
loopEmitter.getLoopIVs(lcvs);		loopEmitter.getLoopIVs(lcvs);

if (op.getOrder()) {		if (op.getOrder()) {
// FIXME: There is some dim/lvl confusion here since `dimRank != lvlRank`		// FIXME: There is some dim/lvl confusion here since `dimRank != lvlRank`
[...]

mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp

[...]	static bool isParallelFor(CodegenEnv &env, bool isOuter, bool isSparse) {
case SparseParallelizationStrategy::kAnyStorageAnyLoop:		case SparseParallelizationStrategy::kAnyStorageAnyLoop:
return true;		return true;
}		}
llvm_unreachable("unexpected parallelization strategy");		llvm_unreachable("unexpected parallelization strategy");
}		}

/// Generates a for-loop on a single index.		/// Generates a for-loop on a single index.
static Operation *genFor(CodegenEnv &env, OpBuilder &builder, bool isOuter,		static Operation *genFor(CodegenEnv &env, OpBuilder &builder, bool isOuter,
bool isInner, LoopId ldx, ArrayRef<TensorId> tids,		bool isInner, LoopId ldx,
ArrayRef<Level> lvls) {		ArrayRef<TensorLevel> tidLvls) {
linalg::GenericOp op = env.op();		linalg::GenericOp op = env.op();
Location loc = op.getLoc();		Location loc = op.getLoc();
auto iteratorTypes = op.getIteratorTypesArray();		auto iteratorTypes = op.getIteratorTypesArray();
bool isSparse = llvm::any_of(tids, [ldx, &env](TensorId tid) {		bool isSparse = llvm::any_of(tidLvls, [ldx, &env](TensorLevel tidLvl) {
const auto dlt = env.dlt(tid, ldx);		// Queries the DLT based on the tensor id and loop idx, as requested by
		wrengr: It seems worth adding a comment explaining why we discard `tlPair.second` and use `ldx` instead; i.e., a comment saying that `isSparse` is checking whether any tensor is sparse for the loop `ldx`, regardless of what level that corresponds to and regardless of what level is stored in `tidLvls`. (Or, if the levels in `tidLvls` are supposed to agree with the loop, then say that instead. Assuming we actually maintain that invariant, then we don't need to add assertions to verify that; but we do want a comment explaining that this invariant holds.)
		// `CodegenEnv::dlt(TensorId, LoopIdx)`. The returned DLT from CodegenEnv
		// should be consistent with the DLT indexed by <TensorId, Level>.
		const auto dlt = env.dlt(env.toTidLvlPair(tidLvl).first, ldx);
		wrengr: should be "tidLvl" to match the names used elsewhere
return isCompressedDLT(dlt) || isSingletonDLT(dlt);		return isCompressedDLT(dlt) || isSingletonDLT(dlt);
});		});

bool isParallel = isParallelFor(env, isOuter, isSparse);		bool isParallel = isParallelFor(env, isOuter, isSparse);

Operation *loop = env.genLoopBoundary([&](MutableArrayRef<Value> reduc) {		Operation *loop = env.genLoopBoundary([&](MutableArrayRef<Value> reduc) {
if (env.merger().isFilterLoop(ldx)) {		if (env.merger().isFilterLoop(ldx)) {
const TensorId tid = tids.front();		const auto [tid, lvl] = env.toTidLvlPair(tidLvls.front());
const Level lvl = lvls.front();
// tids/lvls must only have one value because filter loops only		// tids/lvls must only have one value because filter loops only
// corresponding to the one and only sparse tensor level.		// corresponding to the one and only sparse tensor level.
assert(isSparse && tids.size() == 1 && lvls.size() == 1);		assert(isSparse && tidLvls.size() == 1);
OpOperand *t = &op->getOpOperand(tid);		OpOperand *t = &op->getOpOperand(tid);
auto enc = getSparseTensorEncoding(t->get().getType());		auto enc = getSparseTensorEncoding(t->get().getType());
// Retrieves the affine expression for the filter loop.		// Retrieves the affine expression for the filter loop.
// FIXME: `toOrigDim` is deprecated.		// FIXME: `toOrigDim` is deprecated.
AffineExpr a =		AffineExpr a =
op.getMatchingIndexingMap(t).getResult(toOrigDim(enc, lvl));		op.getMatchingIndexingMap(t).getResult(toOrigDim(enc, lvl));
return env.emitter().enterFilterLoopOverTensorAtLvl(builder, loc, tid,		return env.emitter().enterFilterLoopOverTensorAtLvl(builder, loc, tid,
lvl, a, reduc);		lvl, a, reduc);
}		}
return env.emitter().enterLoopOverTensorAtLvl(builder, loc, tids, lvls,		return env.emitter().enterLoopOverTensorAtLvl(builder, loc, tidLvls, reduc,
reduc, isParallel);		isParallel);
});		});
assert(loop);		assert(loop);
return loop;		return loop;
}		}

/// Emit a while-loop for co-iteration over multiple indices.		/// Emit a while-loop for co-iteration over multiple indices.
static Operation *genWhile(CodegenEnv &env, OpBuilder &builder, LoopId idx,		static Operation *genWhile(CodegenEnv &env, OpBuilder &builder, LoopId idx,
bool needsUniv, ArrayRef<TensorId> tids,		bool needsUniv, ArrayRef<TensorLevel> tidLvls) {
ArrayRef<Level> lvls) {
Operation loop = env.genLoopBoundary([&](MutableArrayRef<Value> reduc) {		Operation loop = env.genLoopBoundary([&](MutableArrayRef<Value> reduc) {
// Construct the while-loop with a parameter for each		// Construct the while-loop with a parameter for each
// index.		// index.
return env.emitter().enterCoIterationOverTensorsAtLvls(		return env.emitter().enterCoIterationOverTensorsAtLvls(
builder, env.op().getLoc(), tids, lvls, needsUniv, reduc);		builder, env.op().getLoc(), tidLvls, needsUniv, reduc);
});		});
assert(loop);		assert(loop);
return loop;		return loop;
}		}

/// Generates a for-loop or a while-loop, depending on whether it implements		/// Generates a for-loop or a while-loop, depending on whether it implements
/// singleton iteration or co-iteration over the given conjunction.		/// singleton iteration or co-iteration over the given conjunction.
static Operation *genLoop(CodegenEnv &env, OpBuilder &builder, LoopOrd at,		static Operation *genLoop(CodegenEnv &env, OpBuilder &builder, LoopOrd at,
bool needsUniv, ArrayRef<TensorId> tids,		bool needsUniv, ArrayRef<TensorLevel> tidLvls,
ArrayRef<Level> lvls, bool isFor) {		bool isFor) {
assert(tids.size() == lvls.size());
const LoopId idx = env.topSortAt(at);		const LoopId idx = env.topSortAt(at);
if (isFor) {		if (isFor) {
bool isOuter = at == 0;		bool isOuter = at == 0;
bool isInner = at == env.topSortSize() - 1;		bool isInner = at == env.topSortSize() - 1;
return genFor(env, builder, isOuter, isInner, idx, tids, lvls);		return genFor(env, builder, isOuter, isInner, idx, tidLvls);
}		}
return genWhile(env, builder, idx, needsUniv, tids, lvls);		return genWhile(env, builder, idx, needsUniv, tidLvls);
}		}

/// Generates the induction structure for a while-loop.		/// Generates the induction structure for a while-loop.
static void finalizeWhileOp(CodegenEnv &env, OpBuilder &builder, LoopId idx,		static void finalizeWhileOp(CodegenEnv &env, OpBuilder &builder, LoopId idx,
bool needsUniv, scf::WhileOp whileOp) {		bool needsUniv, scf::WhileOp whileOp) {
Location loc = env.op().getLoc();		Location loc = env.op().getLoc();
// Finalize each else branch of all if statements.		// Finalize each else branch of all if statements.
if (env.isReduc() || env.isExpand() || env.getInsertionChain()) {		if (env.isReduc() || env.isExpand() || env.getInsertionChain()) {
[...]	static bool startLoopSeq(CodegenEnv &env, OpBuilder &builder, ExprId exp,
// Emit invariants at this loop sequence level.		// Emit invariants at this loop sequence level.
genInvariants(env, builder, exp, ldx, /*atStart=*/true);		genInvariants(env, builder, exp, ldx, /*atStart=*/true);
// Emit access pattern expansion for sparse tensor output.		// Emit access pattern expansion for sparse tensor output.
genExpand(env, builder, at, /*atStart=*/true);		genExpand(env, builder, at, /*atStart=*/true);
// Emit further intitialization at this loop sequence level.		// Emit further intitialization at this loop sequence level.
const LatPointId l0 = env.set(lts)[0];		const LatPointId l0 = env.set(lts)[0];
bool needsUniv = false;		bool needsUniv = false;

SmallVector<TensorId> tids;		SmallVector<TensorLevel> tidLvls;
SmallVector<Level> lvls;
env.merger().foreachTensorLoopId(l0, [&](TensorLoopId b, TensorId tid,		env.merger().foreachTensorLoopId(l0, [&](TensorLoopId b, TensorId tid,
std::optional<Level> lvl,		std::optional<Level> lvl,
DimLevelType dlt, bool isIdxReduc) {		DimLevelType dlt, bool isIdxReduc) {
assert(env.merger().loop(b) == idx);		assert(env.merger().loop(b) == idx);
if (isDenseDLT(dlt) || isUndefDLT(dlt))		if (isDenseDLT(dlt) || isUndefDLT(dlt))
needsUniv = true;		needsUniv = true;
if (isCompressedDLT(dlt) || isSingletonDLT(dlt) ||		if (isCompressedDLT(dlt) || isSingletonDLT(dlt) ||
isCompressedWithHiDLT(dlt) || isIdxReduc) {		isCompressedWithHiDLT(dlt) || isIdxReduc) {
// Only when this is a index reduction loop, can the dlt be undefined.		// Only when this is a index reduction loop, can the dlt be undefined.
assert(!isUndefDLT(dlt) || isIdxReduc);		assert(!isUndefDLT(dlt) || isIdxReduc);
// sparse/singleton levels, or a dense/sparse index reduction loop.		// sparse/singleton levels, or a dense/sparse index reduction loop.
tids.push_back(tid);		tidLvls.emplace_back(env.makeTensorLevel(tid, *lvl));
		wrengr: `push_back`
lvls.push_back(*lvl);
}		}
});		});

env.emitter().enterNewLoopSeq(builder, env.op().getLoc(), tids, lvls);		env.emitter().enterNewLoopSeq(builder, env.op().getLoc(), tidLvls);

// Maintain the universal index only if it is actually		// Maintain the universal index only if it is actually
// consumed by a subsequent lattice point.		// consumed by a subsequent lattice point.
if (needsUniv) {		if (needsUniv) {
for (const LatPointId li : env.set(lts).drop_front())		for (const LatPointId li : env.set(lts).drop_front())
if (!env.merger().hasAnySparse(env.lat(li).simple))		if (!env.merger().hasAnySparse(env.lat(li).simple))
return true;		return true;
}		}
[...]	if (enc) {
const TensorId tid = env.makeTensorId(input->getOperandNumber());		const TensorId tid = env.makeTensorId(input->getOperandNumber());
const Level lvlRank = enc.getLvlRank();		const Level lvlRank = enc.getLvlRank();
assert(lvlExprs.size() == static_cast<size_t>(lvlRank));		assert(lvlExprs.size() == static_cast<size_t>(lvlRank));
// FIXME: there is dim/lvl confusion here		// FIXME: there is dim/lvl confusion here
for (Level l = startLvl; l < lvlRank; l++) {		for (Level l = startLvl; l < lvlRank; l++) {
// FIXME: `toOrigDim` is deprecated.		// FIXME: `toOrigDim` is deprecated.
AffineExpr lvlExpr = lvlExprs[toOrigDim(enc, l)];		AffineExpr lvlExpr = lvlExprs[toOrigDim(enc, l)];
if (enc.isDenseLvl(l) && lvlExpr.isa<AffineConstantExpr>())		if (enc.isDenseLvl(l) && lvlExpr.isa<AffineConstantExpr>())
env.emitter().genDenseAffineAddress(builder, loc, tid, l, lvlExpr);		env.emitter().genDenseAffineAddress(
		builder, loc, env.makeTensorLevel(tid, l), lvlExpr);
else		else
return; // break on first non-dense non-constant level		return; // break on first non-dense non-constant level
}		}
}		}
}		}
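
The calls to `env.makeTensorLevel(tid, l)` and `env.toTidLvlPair(tidLvl)` in this diff pack a tensor id and a level into one `TensorLevel` value and unpack it again. The excerpt does not show how the packing is done, so the sketch below is only one plausible encoding, in the spirit of the review comment on LoopEmitter.h line 46 (`using TensorLevel = unsigned;`, compressed like `TensorLoopId`). The class name `TensorLevelCodec` and the formula `lvl * numTensors + tid` are assumptions made for illustration.

```cpp
#include <cassert>
#include <utility>

// Stand-ins for the MLIR typedefs (assumption: all plain unsigned).
using TensorId = unsigned;
using Level = unsigned;
using TensorLevel = unsigned;

// Hypothetical helper: packs <tid, lvl> into a single unsigned, given the
// total number of tensors up front.
class TensorLevelCodec final {
public:
  explicit TensorLevelCodec(unsigned numTensors) : numTensors(numTensors) {
    assert(numTensors > 0 && "need at least one tensor");
  }

  // Encodes the pair; the tensor id occupies the low "digit".
  TensorLevel makeTensorLevel(TensorId t, Level l) const {
    assert(t < numTensors && "tensor id out of range");
    return l * numTensors + t;
  }

  // Decodes a packed value back into <tid, lvl>.
  std::pair<TensorId, Level> toTidLvlPair(TensorLevel tl) const {
    return {tl % numTensors, tl / numTensors};
  }

private:
  unsigned numTensors;
};
```

With this kind of encoding a single `ArrayRef<TensorLevel>` can replace the parallel `tids`/`lvls` arrays, which is why the `assert(tids.size() == lvls.size())` checks disappear on the right-hand side of the diff.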

static void genInitConstantDenseAddress(CodegenEnv &env,		static void genInitConstantDenseAddress(CodegenEnv &env,
RewriterBase &rewriter) {		RewriterBase &rewriter) {
// We can generate address for constant affine expression before any loops		// We can generate address for constant affine expression before any loops
// starting from the first level as they do not depend on any thing.		// starting from the first level as they do not depend on any thing.
// E.g., [Dense, Dense, Sparse] -> (1, 2, d0), the addresses for the first two		// E.g., [Dense, Dense, Sparse] -> (1, 2, d0), the addresses for the first two
// levels can be determined before loops.		// levels can be determined before loops.
for (TensorId tid = 0, e = env.op().getNumDpsInputs(); tid < e; tid++)		for (TensorId tid = 0, e = env.op().getNumDpsInputs(); tid < e; tid++)
genConstantDenseAddressFromLevel(env, rewriter, tid, 0);		genConstantDenseAddressFromLevel(env, rewriter, tid, 0);
}		}

/// Return true if the lattices bit can be iterated by a for loop.		/// Return true if the lattices bit can be iterated by a for loop.
static bool translateBitsToTidLvlPairs(		static bool
CodegenEnv &env, LatPointId li, LoopId ldx, SmallVectorImpl<TensorId> &tids,		translateBitsToTidLvlPairs(CodegenEnv &env, LatPointId li, LoopId ldx,
SmallVectorImpl<Level> &lvls, SmallVectorImpl<TensorId> &affineTids,		SmallVectorImpl<TensorLevel> &tidLvls,
		wrengr: Why `condTidLvls` instead of just `tidLvls`? (I'm not saying it should be changed, just curious why the different name)
		Peiming (author): Because the call sites were using `condTid` (to distinguish it from `affineTid`). But yeah, `cond-` is not an accurate prefix, since not every `tid, lvl` appears in the loop condition (most of the dense levels do not). I changed it to `tidLvls`.
SmallVectorImpl<Level> &affineLvls, SmallVectorImpl<AffineExpr> &exps) {		SmallVectorImpl<TensorLevel> &affineTidLvls,
		SmallVectorImpl<AffineExpr> &exps) {
const BitVector &simple = env.lat(li).simple;		const BitVector &simple = env.lat(li).simple;
const TensorId outTid = env.merger().getOutTensorID();		const TensorId outTid = env.merger().getOutTensorID();
const std::optional<Level> outLvl = env.merger().getLvl(outTid, ldx);		const std::optional<Level> outLvl = env.merger().getLvl(outTid, ldx);

unsigned numloopCond = 0;		unsigned numloopCond = 0;
bool hasNonUnique = false;		bool hasNonUnique = false;

env.merger().foreachTensorLoopId(		env.merger().foreachTensorLoopId(
li, [&, ldx](TensorLoopId b, TensorId tid, std::optional<Level> lvl,		li, [&, ldx](TensorLoopId b, TensorId tid, std::optional<Level> lvl,
DimLevelType dlt, bool isIdxReduc) {		DimLevelType dlt, bool isIdxReduc) {
if (simple[b]) {		if (simple[b]) {
if (isIdxReduc) {		if (isIdxReduc) {
tids.push_back(tid);		tidLvls.emplace_back(env.makeTensorLevel(tid, *lvl));
		wrengr: `push_back`
lvls.push_back(*lvl);
numloopCond++;		numloopCond++;
return;		return;
}		}
if (isUndefDLT(dlt)) {		if (isUndefDLT(dlt)) {
// An undefined dlt in the lattices, we probably mean to		// An undefined dlt in the lattices, we probably mean to
// iterate based on the level of output tensor. E.g., this		// iterate based on the level of output tensor. E.g., this
// could be a synthetic tensor (for invariants and sparse		// could be a synthetic tensor (for invariants and sparse
// output tensor).		// output tensor).
// out[i][j] = invariant; or a broadcast		// out[i][j] = invariant; or a broadcast
// out[i][j] = in[i] (j is undef for input)		// out[i][j] = in[i] (j is undef for input)
tid = outTid;		tid = outTid;
lvl = outLvl;		lvl = outLvl;
// Skips invalid lvl (e.g., when this is a zero ranked tensor).		// Skips invalid lvl (e.g., when this is a zero ranked tensor).
if (!lvl)		if (!lvl)
return;		return;
}		}
hasNonUnique = !isUniqueDLT(dlt) || hasNonUnique;		hasNonUnique = !isUniqueDLT(dlt) || hasNonUnique;
tids.push_back(tid);		tidLvls.emplace_back(env.makeTensorLevel(tid, *lvl));
		wrengr: `push_back`
lvls.push_back(*lvl);
numloopCond++;		numloopCond++;
} else if (isDenseDLT(dlt) || isIdxReduc) {		} else if (isDenseDLT(dlt) || isIdxReduc) {
tids.push_back(tid);		tidLvls.emplace_back(env.makeTensorLevel(tid, *lvl));
		wrengr: `push_back`
lvls.push_back(*lvl);
} else {		} else {
assert(isUndefDLT(dlt));		assert(isUndefDLT(dlt));
linalg::GenericOp op = env.op();		linalg::GenericOp op = env.op();
if (tid >= op.getNumDpsInputs())		if (tid >= op.getNumDpsInputs())
// We only handle affine expression on input tensors (for now).		// We only handle affine expression on input tensors (for now).
return;		return;
OpOperand *operand = &op->getOpOperand(tid);		OpOperand *operand = &op->getOpOperand(tid);
const auto stt = getSparseTensorType(operand->get());		const auto stt = getSparseTensorType(operand->get());
[...]	env.merger().foreachTensorLoopId(
// level. We need to generate the address according to the		// level. We need to generate the address according to the
// affine expression. This is also the best place we can do it		// affine expression. This is also the best place we can do it
// to avoid putting it inside inner loops.		// to avoid putting it inside inner loops.
// NOTE: It assumes that the levels of the input tensor are		// NOTE: It assumes that the levels of the input tensor are
// initialized in order (and it is also currently guaranteed by		// initialized in order (and it is also currently guaranteed by
// computeIterationGraph), another more admissible approach		// computeIterationGraph), another more admissible approach
// might be accepting out-of-order access between consecutive		// might be accepting out-of-order access between consecutive
// dense levels.		// dense levels.
affineTids.push_back(tid);		affineTidLvls.emplace_back(env.makeTensorLevel(tid, l));
		wrengr: `push_back`
affineLvls.push_back(l);
exps.push_back(exp);		exps.push_back(exp);
}		}
}		}
}		}
}		}
});		});

if (isDenseDLT(env.dlt(outTid, ldx))) {		if (isDenseDLT(env.dlt(outTid, ldx))) {
// Note that we generate dense indices of the output tensor		// Note that we generate dense indices of the output tensor
// unconditionally, since they may not appear in the lattice, but may be		// unconditionally, since they may not appear in the lattice, but may be
// needed for linearized env.		// needed for linearized env.
tids.push_back(outTid);		tidLvls.emplace_back(env.makeTensorLevel(outTid, *outLvl));
		wrengr: `push_back`
lvls.push_back(*outLvl);
}		}

assert(numloopCond > 0);		assert(numloopCond > 0);
// If we just need to one loop conditions and the conditions is not imposed on		// If we just need to one loop conditions and the conditions is not imposed on
// non-unique level, the loop can be generated by a for loop.		// non-unique level, the loop can be generated by a for loop.
return numloopCond == 1 && !hasNonUnique;		return numloopCond == 1 && !hasNonUnique;
}		}
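
`translateBitsToTidLvlPairs` appends to `affineTidLvls` and `exps` in lockstep, so the two vectors are expected to have the same length. The reviewer's alternative is to store each level together with its expression, so the lengths cannot diverge. A minimal standalone sketch of that idea follows. `AffineExprStub` is a stand-in for `mlir::AffineExpr`, and `firsts` is a hypothetical eager helper playing the role that `llvm::make_first_range` plays lazily; neither name comes from the patch.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using TensorLevel = unsigned;
using AffineExprStub = std::string; // stand-in for mlir::AffineExpr

// One entry per dense level with a non-trivial affine expression.
using TidLvlExpr = std::pair<TensorLevel, AffineExprStub>;

// Copies out only the TensorLevel components, for callers that need the
// levels without the expressions.
inline std::vector<TensorLevel> firsts(const std::vector<TidLvlExpr> &v) {
  std::vector<TensorLevel> out;
  out.reserve(v.size());
  for (const auto &[tidLvl, expr] : v) {
    (void)expr;
    out.push_back(tidLvl);
  }
  return out;
}
```

The trade-off raised in the thread still applies: if the levels are mostly consumed on their own, two separate vectors may be the simpler choice.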

/// Starts a single loop in current sequence.		/// Starts a single loop in current sequence.
static std::pair<Operation *, bool> startLoop(CodegenEnv &env,		static std::pair<Operation *, bool> startLoop(CodegenEnv &env,
OpBuilder &builder, LoopOrd at,		OpBuilder &builder, LoopOrd at,
LatPointId li, bool needsUniv) {		LatPointId li, bool needsUniv) {
// The set of tensors + lvls to generate loops on		// The set of tensors + lvls to generate loops on
SmallVector<TensorId> tids, affineTids;		SmallVector<TensorLevel> tidLvls;
SmallVector<Level> lvls, affineLvls;
// The set of dense tensors with non-trivial affine expression that just		// The set of dense tensors with non-trivial affine expression that just
// becomes invariant and the address shall now be generated at the current		// becomes invariant and the address shall now be generated at the current
// level.		// level.
		SmallVector<TensorLevel> affineTidLvls;
		wrengr: I think the `affineTidLvls` and `affines` should actually be combined into a single `SmallVector<TLA>` where `struct TLA final { TensorId; Level; AffineExpr }` (or `struct TLA final {TensorLevel; AffineExpr}` with the appropriate getters). That helps capture the invariant that `translateBitsToTidLvlPairs` keeps them the same length, and thus also helps avoid needing to `zip` them together later on. Of course, you'll need to come up with a better name than my "TLA" ;)
		Peiming (author): I want to keep it, because the `affineTidLvls` are used independently from `affines` later. See L1662.
SmallVector<AffineExpr> affines;		SmallVector<AffineExpr> affines;
bool isSingleCond = translateBitsToTidLvlPairs(		bool isSingleCond = translateBitsToTidLvlPairs(
env, li, env.topSortAt(at), tids, lvls, affineTids, affineLvls, affines);		env, li, env.topSortAt(at), tidLvls, affineTidLvls, affines);

// Emit the for/while-loop control.		// Emit the for/while-loop control.
Operation *loop =		Operation *loop = genLoop(env, builder, at, needsUniv, tidLvls, isSingleCond);
genLoop(env, builder, at, needsUniv, tids, lvls, isSingleCond);
Location loc = env.op().getLoc();		Location loc = env.op().getLoc();
for (auto [tid, lvl, exp] : llvm::zip(affineTids, affineLvls, affines)) {		for (auto [tidLvl, exp] : llvm::zip(affineTidLvls, affines)) {
env.emitter().genDenseAffineAddress(builder, loc, tid, lvl, exp);		env.emitter().genDenseAffineAddress(builder, loc, tidLvl, exp);
		wrengr: I still think it'd be better to have a single `SmallVector<std::pair<TensorLevel, AffineExpr>>` since that helps capture the invariant that these should be the same length. But if you're not going to do that, then you should use `llvm::zip_equal` so that it checks that they have the same length rather than silently truncating things to the shorter list
		Peiming (author): Good to know there is a `zip_equal`.
}		}

// Until now, we have entered every <tid, lvl> pair in {cond, extra,		// Until now, we have entered every <tid, lvl> pair in {cond, extra,
// affine}Tids/Lvls. The addresses of the upcoming levels which are dependent		// affine}Tids/Lvls. The addresses of the upcoming levels which are dependent
// on constant affines expression may now be determined.		// on constant affines expression may now be determined.
auto allTids = llvm::concat<TensorId>(tids, affineTids);		auto allTidLvls = llvm::concat<TensorLevel>(tidLvls, affineTidLvls);
auto allLvls = llvm::concat<Level>(lvls, affineLvls);		for (TensorLevel tidLvl : allTidLvls) {
for (auto [tid, lvl] : llvm::zip(allTids, allLvls)) {		auto [tid, lvl] = env.toTidLvlPair(tidLvl);
		wrengr: If this concat is what's making you reluctant to use `SmallVector<std::pair<TensorLevel, AffineExpr>>`, then recall that you can use `llvm::concat<TensorLevel>(tidLvls, llvm::make_first_range(affineTidLvlExprs))`.
if (tid != env.merger().getOutTensorID())		if (tid != env.merger().getOutTensorID())
genConstantDenseAddressFromLevel(env, builder, tid, lvl + 1);		genConstantDenseAddressFromLevel(env, builder, tid, lvl + 1);
}		}

return std::make_pair(loop, isSingleCond);		return std::make_pair(loop, isSingleCond);
}		}
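
The loop over `llvm::zip(affineTidLvls, affines)` in `startLoop` relies on both vectors having the same length. `llvm::zip` stops at the shorter range, while `llvm::zip_equal` asserts that the lengths match. The standalone sketch below mimics the checked behavior with the standard library only; `forEachZipEqual` is a hypothetical helper written for this illustration, not an LLVM API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Visits corresponding elements of two vectors. Unlike a truncating zip,
// a length mismatch trips the assertion instead of being silently ignored.
template <typename A, typename B, typename Fn>
void forEachZipEqual(const std::vector<A> &as, const std::vector<B> &bs,
                     Fn fn) {
  assert(as.size() == bs.size() && "zipped ranges must have equal length");
  for (std::size_t i = 0, e = as.size(); i < e; ++i)
    fn(as[i], bs[i]);
}
```

In the patch itself the equivalent change would be to swap `llvm::zip` for `llvm::zip_equal` in the `for (auto [tidLvl, exp] : ...)` loop, as the reviewer suggests.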

/// Ends a single loop in current sequence. Returns new values for needsUniv.		/// Ends a single loop in current sequence. Returns new values for needsUniv.
[...]


[mlir][sparse] group tensor id and levels into pairs in loop emitter
Closed, Public

Details

Diff Detail

Event Timeline