mlir/lib/Dialect/SparseTensor/Transforms/LoopEmitter.h
Show All 76 Lines
public:
  using OutputUpdater = function_ref<Value(OpBuilder &builder, Location loc,
                                           Value memref, Value tensor)>;
  // Map from [tid, dim] to a list of dependent [tid, dim] for affine expression
  // index on sparse tensors.
  // E.g., for affine index (d0 + d1), it depends on the two [tid, dim] that
  // define d0 and d1 (for affine expression reduction).
  // If the list is empty, it means that there is no affine expression on the
  // input [tid, dim].
  // NOTE: The caller is responsible for ensuring that the order of the
  // returned list is consistent with the topological order of the iteration
  // graph; otherwise the loop emitter might reduce a wrong dependent index
  // variable when generating slice-driven loops.

aartbik: Comments with "should" or "must" are a bit dangling unless you say what happens when this assumption is not true.
aartbik: I think what is still missing is whether it is enforced (viz. asserts fail when trying to set it). So, something like "Clients are responsible for ensuring that the order of the returned ...". (I think my original comment was really on when I see "should" or "must", who is to blame in the end ;-)
  using DependentLvlGetter =
      function_ref<std::vector<std::pair<TensorId, Level>>(TensorId, Level)>;
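For illustration, one way a client could supply this callback (a minimal sketch; `deps` and `lookup` are hypothetical names, and the dependence contents would come from analyzing affine index expressions such as A[d0 + d1], not from this patch):

  // Hypothetical precomputed dependence map, keyed by (TensorId, Level).
  std::map<std::pair<TensorId, Level>, std::vector<std::pair<TensorId, Level>>>
      deps;
  auto lookup = [&](TensorId t, Level l)
      -> std::vector<std::pair<TensorId, Level>> {
    auto it = deps.find({t, l});
    // An empty list means there is no affine expression on the input [t, l].
    return it == deps.end() ? std::vector<std::pair<TensorId, Level>>{}
                            : it->second;
  };
  DependentLvlGetter getter = lookup; // non-owning; `lookup` must outlive it.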
  LoopEmitter() = default;

  /// Takes an array of input tensors, which the generated loops will
  /// iterate over. Each tensor is given a `TensorId` (numerically equal
  /// to the position of that tensor `Value` in the array). Setting
Show All 35 Lines
public:
  ///     for (i = p0; i < end; i++)
  ///       ...
  ///     // loop sequence end.
  ///   }
  void enterNewLoopSeq(OpBuilder &builder, Location loc,
                       ArrayRef<TensorId> tids, ArrayRef<Level> lvls);
  /// Exits the current loop sequence, this will reset universal index to 0.
  void exitCurrentLoopSeq(OpBuilder &builder, Location loc);

aartbik: Why did you change Exits -> exit? The original seems okay.
wrengr: This should remain "/// Exits".
wrengr: You want to keep the triple-slash "///", since that's what the tooling uses for generating the API documentation.
  // TODO: Get rid of `lvls` in the argument list? Track the level we
  // are currently at internally. Then it would be enterNextLvlForTensor.
  // Still need a way to specify the lvl for non-annotated tensors though,
  // as those can be accessed out of order.
  //
  /// Emits loop over tensor_tid_lvl, it assumes that loops between
  /// tensor_tid_[0, lvl - 1] have already been generated.
Show All 54 Lines
  const std::vector<std::vector<Value>> &getCoordinateBuffers() const {
    return coordinatesBuffers;
  };
  const std::vector<Value> &getValBuffer() const { return valBuffer; };

  constexpr static llvm::StringLiteral getLoopEmitterLoopAttrName() {
    return llvm::StringLiteral("Emitted from");
  }

private:
  // LoopInfo stores information of a loop generated by LoopEmitter, e.g.,
  // the set of tensor levels that the loop is iterating over.

aartbik: Since we have several nested structs, can you give each a short comment (as documentation, and to improve readability)?

  struct LoopInfo final {
    LoopInfo(ArrayRef<TensorId> tids, ArrayRef<Level> lvls,
             ArrayRef<TensorId> slicedTids, ArrayRef<Level> slicedLvls,
             ArrayRef<bool> sliceReduced, Operation *loop, Block *userBlock,
             Value iv, StringAttr loopTag)
        : tids(tids), lvls(lvls), slicedTids(slicedTids),
          slicedLvls(slicedLvls), sliceReduced(sliceReduced), loop(loop),
          userCodeBlock(userBlock), iv(iv) {
      // Attach a special tag to loops generated by the loop emitter.
      if (loopTag)
        loop->setAttr(LoopEmitter::getLoopEmitterLoopAttrName(), loopTag);
    }
    // TODO: maybe use a vector<pair> for tid and lvl?
    // (Or compress them together with a `TensorLoopId`.)
    // The set of tensors that the loop is operating on
    const llvm::SmallVector<TensorId> tids;
    // The corresponding levels for the tensors
    const llvm::SmallVector<Level> lvls;
    // The set of tensors for slice-driven loop conditions.

aartbik: Here and below (and above), period at end.
wrengr: This comment should explain how exactly the slicedTids differs from the other tids. Also, is the same tensor allowed to occur in both fields? If so, then what does that mean? And can the same level of the same tensor occur in both of the corresponding fields?

    const llvm::SmallVector<TensorId> slicedTids;
    // The corresponding level for slice-driven tensors.

wrengr: "levels"

    const llvm::SmallVector<Level> slicedLvls;
    // Whether the tensor is fully reduced (e.g., i + j => j).

wrengr: This comment is wrong for this field.

    const llvm::SmallVector<bool> sliceReduced;

wrengr: I think it'd be better to combine these all into a single `const SmallVector<LoopLevelInfo>` where
    struct LoopLevelInfo final { TensorId tid; Level lvl; bool isSlice; bool isReduced; };
assuming it's okay to combine the original (tids, lvls) with the new (slicedTids, slicedLvls, sliceReduced) into a single vector/set. If that's not okay for some reason, then I still think it'd be good to use a single `const SmallVector<LoopSlicedLevelInfo>` field for the new stuff. Using AOS will ensure that there's the right number of all the things that should correspond, as well as keeping the corresponding things close together. Plus it'll make it easier to add additional fields in the future as needed.
Peiming: I agree with you on this; in fact, see my comment at L222. I will do it in a separate patch though.
    const Operation *loop;      // the loop operation
    Block *const userCodeBlock; // the block holding users' generated code.
    const Value iv;             // the induction variable for the loop
  };
  // SliceInfo stores information of an extracted slice for a slice-driven
  // loop, e.g., the in-scope SSA values for the minimum coordinates and offset
  // for the slice, etc.

wrengr: "...do not need to actually create a sparse..."
wrengr: "...only need to maintain the..."

  struct SliceInfo final {

wrengr: Is this actually the full MemRef of coordinates for all levels? If so then it should be "minCoords" (with an "s"). Whereas if it's just a single coordinate for the given level, then it should be "minCrd".
wrengr: I'm guessing this should be `LoopOrd`. If not, then what is it?

    // Note that we do not need to create an actual sparse tensor slice but
    // instead only need to maintain the metadata of the slice.
    SliceInfo(Value minCrd, Value offset, Value isNonEmpty,
              std::optional<Level> slicedOnLvl, unsigned depth)

wrengr: Given our discussion of boolean blindness, this assertion suggests that the two parameters should be combined into a single std::optional<std::pair<Level, Value>> parameter. (Albeit you'll still want to assert you didn't get a null Value.) If that combined parameter doesn't work, why not? The only other thing that would be consistent with the assertion is union{ non-null-Value; struct{non-null-Value; Level}}, but that's equivalent to struct{non-null-Value; optional<Level>}, so if that's what you want then you should assert(minCoord) instead.
Peiming: It should work; I will add a TODO here and submit the change in a separate patch.

        : minCrd(minCrd), offset(offset), isNonEmpty(isNonEmpty),
          slicedOnLvl(slicedOnLvl), depth(depth) {
      // TODO: use std::optional<pair<Level, minCrd>>
      assert(!slicedOnLvl || minCrd);
    }

wrengr: Given the comment, I think this would be better named something like "isFirstSlice" or "isInitialSlice".
Peiming: I will change the comment; it means whether it is the initial tensor that has not yet been sliced.
wrengr: If the value is just a single coordinate, then this should also be singular.

    // Whether this is the tensor that has not yet been sliced.
    bool isInitialTensor() const { return !slicedOnLvl.has_value(); }

    Value minCrd;     // the minimum coordinate of the slice.

aartbik: Wren can correct me if I am wrong, but I think this needs to be minimum, right (as in smallest value, and not the lowest according to some other measure)?

    Value offset;     // the offset of the current slice.
    Value isNonEmpty; // whether the slice is non-empty.
    std::optional<Level> slicedOnLvl; // the level on which the slice is done
    unsigned depth;   // the depth (relative to dependentDimMap[tid][lvl]).
  };
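A minimal sketch of the constructor shape suggested in the review (folding `slicedOnLvl` and `minCrd` into one parameter); this is hypothetical, not code from the patch:

    // Sketch only: a sliced level always carries its minimum coordinate,
    // so the two can travel together in one optional parameter.
    SliceInfo(Value offset, Value isNonEmpty,
              std::optional<std::pair<Level, Value>> slicedLvlAndMinCrd,
              unsigned depth)
        : minCrd(slicedLvlAndMinCrd ? slicedLvlAndMinCrd->second : Value()),
          offset(offset), isNonEmpty(isNonEmpty),
          slicedOnLvl(slicedLvlAndMinCrd
                          ? std::optional<Level>(slicedLvlAndMinCrd->first)
                          : std::nullopt),
          depth(depth) {
      assert(!slicedOnLvl || minCrd); // a sliced level implies a valid minCrd
    }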
  /// Linearizes address for dense level (i.e., p = (i * d0) + j).

wrengr: Can you make this class "final" (ditto for the LoopInfo)? You don't need/use subclassing here, and marking it final helps the compiler generate more efficient code.

  Value genAddress(OpBuilder &builder, Location loc, TensorId tid, Level lvl,
                   Value iv);
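For intuition, the linearized position is just a multiply-add; a rough sketch of emitting it with arith ops (the helper and value names here are assumptions for illustration, not the emitter's actual members):

  // Sketch: p = parentPos * lvlSize + iv, i.e., (i * d0) + j above.
  Value linearizeDense(OpBuilder &builder, Location loc, Value parentPos,
                       Value lvlSize, Value iv) {
    Value scaled = builder.create<arith::MulIOp>(loc, parentPos, lvlSize);
    return builder.create<arith::AddIOp>(loc, scaled, iv);
  }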
  /// Generates the segment high for a non-unique level (to fast forward
  /// duplicated coordinates). That is, it generates the code:
  ///
  ///   crd = coordinates_tid_lvl[pos]
  ///   while (pos < pHi && coordinates_tid_lvl[pos] == crd)
Show All 37 Lines
private:
  /// Emits extra locals, since the locals might not be in the simplified
  /// lattice points used to generate the loops, but are still required to
  /// generate expressions.
  void emitExtraLocalsForTensorsAtDenseLvls(OpBuilder &builder, Location loc,
                                            ArrayRef<TensorId> tids,
                                            ArrayRef<Level> lvls);
  /// Emits a for loop to iterate over a dense level, or a sparse level that
  /// has not been sliced.

wrengr: This should be `Level lvl` (I'm pretty sure).

  Operation *emitForLoopOverTensorAtLvl(OpBuilder &builder, Location loc,
                                        TensorId tid, Level lvl,
                                        MutableArrayRef<Value> reduc,
                                        bool isParallel);

  /// Emits a while loop to iterate over a sparse level that has been sliced.
  /// Inserts a break statement when the coordinate exceeds the sliceSize;
  /// the method sets the insertion point inside the generated while loop body
  /// after the break statement before return (so that callers need to handle
  /// only in-bound coordinates).

aartbik: is exceeds -> exceeds; but more importantly, I would state this: "We break out of the loop when the coordinate exceeds the sliceSize."

  Operation *emitWhileLoopOverSliceAtSparseLvl(OpBuilder &builder, Location loc,
                                               Value pLo, Value pHi,
                                               Value offset, Value sliceSize,
                                               TensorId tid, Level lvl,
                                               MutableArrayRef<Value> reduc);
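Roughly, the emitted loop has the following shape (illustrative pseudocode in the style of the doc comments above; not taken verbatim from the patch):

    while (pos < pHi) {
      crd = coordinates_tid_lvl[pos] - offset
      if (crd >= sliceSize)
        break                // past the slice boundary
      // insertion point is left here, so callers see only in-bound crd
      pos = pos + 1
    }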
  /// Exits a for loop, returns the reduction results, e.g.,
  /// For sequential for loops:
  ///   %ret = for () {
  ///     ...
  ///     %val = addi %args, %c
  ///     yield %val
  ///   }
  /// For parallel loops, the following generated code by users:
Show All 47 Lines
  if (const auto reassoc = collapseReassoc[tid]) {
          llvm::map_range(srcLvls, [&](Attribute srcLvl) -> Level {
            // TODO: replace this with the converter for `LevelAttr`.
            return srcLvl.cast<IntegerAttr>().getValue().getZExtValue();
          }));
    }
    return {dstLvl};
  }
  //
  // Slice-driven loop related methods.
  //
  /// Retrieves the most recent slice on lvl. To reduce an affine expression
  /// like d0 + d1 + d2, we need two slices (one of size d1 + d2, and the other
  /// of size d2). This method returns the latter slice (of size d2), which is
  /// also the final slice on the level.

aartbik: the most recent slice (singular)
wrengr: This description doesn't make sense to me. Do you have a design doc that explains what exactly you mean by "slice" in this context, and explains how/why "reducing d0+d1+d2" translates into needing the two slices you mention?
Peiming: Yeah, I am writing a paper on this (but still at a very early stage); I will share it with you later when it is more or less complete.

  const SliceInfo &getFinalSliceOnLvl(TensorId tid, Level lvl);

aartbik: const ref?
  /// Gets the remaining number of constraints needed to fully *resolve*
  /// dependent levels on tensor[tid].

wrengr: Either "number of constraints needed to..." or "number of constraints that are needed to...".
wrengr: "level"

  unsigned remDepOnLevel(TensorId tid, Level lvl) const;

aartbik: Perhaps we should discuss somewhere else, but we use "unsigned" in most places, and size_t only for local operations, or inside casts and asserts.
  /// Whether the tid, lvl is fully *reduced*, i.e., the non-trivial index
  /// expression has been reduced to a trivial one.
  /// E.g., A[i + j] => A[i + 2] (j is reduced).

wrengr: That should be "A[i+j] => A[i+2]" to make it clear that it derives from "j => 2".

  bool depFullyReduced(TensorId tid, Level lvl) const {
    return remDepOnLevel(tid, lvl) == 1;
  }
  /// Whether the tid, lvl is fully resolved, i.e., we entered the level already
  /// (the index on that level is determined).
  /// E.g., A[i + j] => A[2 + 3] (both i and j become invariants for inner
  /// loops).
  bool lvlFullyResolved(TensorId tid, Level lvl) const {
    return remDepOnLevel(tid, lvl) == 0;
  }
  /// Generates a whileOp to iterate over a subset of coordinates on tid on lvl
  /// using the pHi and pLo provided; the loop breaks on the first coordinate
  /// that exceeds the slice boundary (i.e., coord >= slice.offset +
  /// slice.size).
  std::pair<Operation *, ValueRange>
  genSliceLvlTraverseLoop(OpBuilder &builder, Location loc, Value pLo,
                          Value pHi, Value offset, Value size, TensorId tid,
                          Level lvl, ValueRange userReduc, bool genYield,
                          /*bodyBuilder=*/
                          llvm::function_ref<void(OpBuilder &, Location, Value,
                                                  MutableArrayRef<Value>)>);

wrengr: Should this be `LoopOrd`? If not, then what is it?
Peiming: This is an `unsigned` index into another array (not the loop sequence). I will stick with this.
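For illustration, a caller might invoke it along these lines (a sketch under the assumption that pLo, pHi, offset, size, tid, lvl, and userReduc are in scope; not code from the patch):

  // Hypothetical call site: traverse the in-bound coordinates of (tid, lvl)
  // and let the callback update the user's reduction values.
  auto [loop, results] = genSliceLvlTraverseLoop(
      builder, loc, pLo, pHi, offset, size, tid, lvl, userReduc,
      /*genYield=*/true,
      [&](OpBuilder &b, Location l, Value crd, MutableArrayRef<Value> reduc) {
        // `crd` stays within the slice boundary here.
      });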
  /// Generates a nested loop that iterates over tid on all the coordinates on
  /// lvl.
  ValueRange genUnResolvedSliceTreeTraverse(
      OpBuilder &builder, Location loc, Value offset, TensorId tid, Level lvl,
      size_t depth, ValueRange userReduc,
      /*bodyBuilder=*/
      llvm::function_ref<void(OpBuilder &, Location, Value,
                              MutableArrayRef<Value>)>);
  /// Generates code to get the first non-empty slice of tid on lvl, when all
  /// the previous levels before `lvl` are resolved (or lvl is the first level).
  ///
  /// This is the simple case because the previous levels are resolved into a
  /// single node in the storage tree.
  void genResolvedSliceBegin(OpBuilder &builder, Location loc, TensorId tid,
                             Level lvl);

wrengr: What is this supposed to be: `LoopOrd`, `LoopId`, other?
Peiming: This is an `unsigned` counter.

  /// Generates code to get the first non-empty slice of tid on lvl, when
  /// the previous levels before `lvl` are unresolved.
  ///
  /// This is the complex case because the previous levels correspond to a
  /// range of nodes in the storage tree.
  void genUnResolvedSliceBegin(OpBuilder &builder, Location loc, TensorId tid,
                               Level lvl);

  /// Generates code to get the first non-empty slice of tid on lvl.
  /// Returns true if the slice has already been resolved.
  bool genSliceBegin(OpBuilder &builder, Location loc, TensorId tid, Level lvl);
  /// Generates code to get the next non-empty slices of tid on lvl.
  void genSliceNextInduction(OpBuilder &builder, Location loc,
                             const Operation *whileOp, TensorId tid, Level lvl,
                             SmallVectorImpl<Value> &operands,
                             unsigned &retIdx);
  /// Generates a slice-driven while loop as follows.
  ///
  ///   curSlice = getFirstNonEmptySlice(tensor).
  ///
  ///   while(isNonEmpty) {
  ///     ..user code..
  ///     isNonEmpty, curSlice = getNextNonEmptySlice(curSlice)
  ///   }

aartbik: as follows?

  Operation *emitSliceDrivenLoopOverTensorAtLvl(OpBuilder &builder,
                                                Location loc, TensorId tid,
                                                Level lvl,
                                                MutableArrayRef<Value> reduc);
  /// An optional string attribute that should be attached to the loop
  /// generated by the loop emitter; it might help following passes to identify
  /// loops that operate on sparse tensors more easily.
  StringAttr loopTag;
  /// Whether the loop emitter needs to treat the last tensor as the output
  /// tensor.
  bool hasOutput;
  bool isSparseOut;
  /// The insertion point to allocate top-level local variables.

aartbik: Period at end. "to allocate top level local" makes very little sense when read in isolation. Just say what code fragment this points to.

  Operation *localInsertPos;
  //
  // Fields which have `numTensor` many entries.
  //
  // TODO: switch to an AOS style to avoid any possible mismatches.
  //
  /// Input and (optional) output tensors.
  std::vector<Value> tensors;
Show All 19 Lines
private:
  // The segment upper bound for non-unique levels after de-duplication.
  std::vector<std::vector<Value>> segHi;
  std::vector<std::vector<Value>> highs;
  std::vector<std::vector<Value>> lvlSizes;
  std::vector<std::vector<Value>> positionsBuffers;   // to_positions
  std::vector<std::vector<Value>> coordinatesBuffers; // to_coordinates
  std::vector<Value> valBuffer;                       // to_value
  //
  // Slice-driven loop related fields.
  //
  /// Whether the sparse input is a slice.
  std::vector<bool> isSparseSlices;
  /// Values related to slices.
  std::vector<std::vector<Value>> sliceOffsets;
  std::vector<std::vector<Value>> sliceStrides;
  // Map from [tid, level] to a list of dependent [tid, level].
  // See comments for `DependentLvlGetter`.
  std::vector<std::vector<std::vector<std::pair<TensorId, Level>>>>
      dependentLvlMap;
  // The cached position buffer for the slices; it serves the same purpose as
  // ptrBuffer for compressed dimensions, but it always starts with the first
  // pidx pointing to coord > slice.offset, to avoid iterating from the
  // beginning.
  std::vector<std::vector<std::vector<Value>>> slicePosBuffer;
  // The cached size for each slice.
  std::vector<std::vector<std::vector<Value>>> sliceSizes;
  // The number of reduced dependencies on a tensor level so far.
  std::vector<std::vector<unsigned>> levelReducedDep;
  // sliceStack[tid] holds the generated slice stack on tid.
  std::vector<std::vector<SliceInfo>> sliceStack;
  //
  // View-based reshape related fields and methods.
  //
  /// Collapse reassociations related to a specific tensor.
  // TODO: support expand.
  std::vector<ArrayAttr> collapseReassoc;
  /// TODO: not yet used, it should track the current level for each tensor
  /// to help eliminate `lvls` parameters from the above APIs.
  /// std::vector<Level> curLvl;
  //
  // Fields which have at most `numLoops` many entries.
  //
  /// Loop Stack, stores the information of all the nested loops that are
  /// alive.
  std::vector<LoopInfo> loopStack;
  // Loop Sequence Stack, stores the universal index for the current loop
  // sequence, and a list of the tids which have been sliced.
  // TODO: maybe we should have a LoopSeqInfo.
  std::vector<std::pair<Value, std::vector<std::tuple<TensorId, Level, bool>>>>
      loopSeqStack;
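A possible shape for the LoopSeqInfo mentioned in the TODO (purely a sketch; the struct and field names are invented, not part of this patch):

  // Hypothetical bundling of the pair/tuple above into one named struct.
  struct LoopSeqInfo final {
    struct SlicedTidLvl final {
      TensorId tid;
      Level lvl;
      bool fullyReduced; // whether the slice on (tid, lvl) is fully reduced
    };
    Value universalIndex; // universal index for this loop sequence
    std::vector<SlicedTidLvl> slicedTidLvls;
  };
  // std::vector<LoopSeqInfo> loopSeqStack;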
  /// Maps `LoopId` (used by `AffineDimExpr`) to `LoopOrd` (in the `loopStack`).
  /// TODO: We should probably use a callback function here to make it more
  /// general.
  std::vector<LoopOrd> loopIdToOrd;
};

} // namespace sparse_tensor
} // namespace mlir

#endif // MLIR_DIALECT_SPARSETENSOR_TRANSFORMS_SPARSETENSORLOOPEMITTER_H_