This patch introduces the initial bits to support vector masking
using the vector.mask operation. Vectorization changes should be
NFC for non-masked cases. We can't test masked cases directly until
we extend the Transform dialect to support masking.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp | ||
---|---|---|
1809 | Hey @nicolasvasilache, do you think you could help extending the Transform dialect so that we can provide the vector sizes for masked dims? |
Fixing a couple of issues when no vector sizes for masked dimensions are not provided
mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp | ||
---|---|---|
1809 | Happy to! Could you temporarily hardcode some size in there and add some test IR so I can see what I should expect? This will likely require a new transform op that is not a blanket "vectorize the world" so that we can pass the information you want at a finer granularity. This will likely need some iteration to get to a reasonably scalable usage. Left some other review comments in the meantime. | |
mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp | ||
73 | super-nit: can we make the line spacing uniform between methods(here) and members(below)? | |
108 | plz avoid unsigned everywhere, we know by now this is not meant for expressing >=0 but should really be used for bit-twiddling or when we really really nerd the extra bit. | |
108 | Should this be a (better named) helper on the LinalgOp interface? The name makes it hard to understand what it does and we should be doing any such manipulation in a very localized place in LinalgOp. | |
136 | unsigned purge here and everywhere plz | |
153 | Can we call these staticUpperBounds everywhere? This seems easier to me to relate to what we're looking to do instead of vecSizesForMaskedDims and extractDynamicVectorDimValues. | |
157 | How about early exit here ? if (!linalgOp.hasDynamicShape()) { canonicalVecShape = linalgOp.getStaticLoopRanges(); return success(); } I don't think you need the checks and debugs after that in the static case? | |
165 | Seems fishy, what happens in this case ? I'd expect this to be not fail gracefully .. Edit: ah scratch that, I see that this is just after the precondition, can we make it part of the precondition? | |
170 | Can we sprinkle a few precompute prefixes in some of these APIs to make it clear what happens at init time? | |
177 | LLVM_DEBUG(llvm::interleaveComma(canonicalVecShape, llvm::dbgs() << ...)); | |
247 | pass RewriterBase here and everywhere possible post https://reviews.llvm.org/D137922 plz | |
272 | Plz use updateRootInPlace once RewriterBase is piped through. | |
810 | nit: can we spell this as: // 3.a. Convert the indexing map for this input/output to a transfer read... ... /// 3.a.i For input reads we use the canonical vector shape. if (linalgOp.isDpsInput(opOperand)) ... } else { /// 3.a.ii For output reads (iteration-carried dependence, e.g., reductions) ... // 3.b. If masked, set in-bounds to true. ... // 3.c. Not all ops support 0-d vectors, | |
1044 | Can we make this init the state as part of the precondition? | |
mlir/lib/IR/AffineMap.cpp | ||
344 | Fails when called on a non-projected-permutation. is misleading here. It expects a projected permutation otherwise it crashes. | |
347 | Better name and doc please, this is much too confusing. In fact I think you can just do something like |
- Addressed feedback
- New changes around vectorization initState and canonical vector shape computation
mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp | ||
---|---|---|
108 |
Sorry,dim is a misnomer as it refers to the dimension position, not the size. It has to be unsigned as that's what it's expected by the AffineMap::getXXXYYYPosition(). Moved it to LinalgOp interface. I can't think of a better name for the utility... It's just mapping an iteration space dimension to a dimension of an operand... Any other suggestion? I'm happy to replace with with any other existing utility but I couldn't find any. | |
136 | As explained before, renamed to indicate that it's the position, not the size. | |
153 | I had already renamed this locally to inputVectorSizes and change a bit the meaning. The input sizes are now taken into account to compute the canonical vector shape and if they are also provided for static shapes they should match the size of the static shapes. We are passing them all now to simplify the client API, including the transform dialect, as it's easier to provide all the vector sizes than having to filter out the static ones. Let me know if that works. | |
157 | This is gone now. This code has changed a bit in the last version. | |
165 | Let me know if it makes more sense in the new version, where the inputVectorSizes, if provided, should match the linalgop.getNumLoops(). Otherwise, this would be a bug. | |
170 | Much better! | |
177 | Ah! I didn't know this utility! It's been such a pain to always print SmallVector's... | |
247 | I think we are going in the opposite direction based on the review comments? | |
1044 | Probably better to separate the concerns. I think even part of the precondition checks are reused outside of the vectorizer. A public interface was introduced recently. | |
mlir/lib/IR/AffineMap.cpp | ||
347 | I had renamed this like 10 times. It's a difficult name. Hopefully it's better now :). |
mlir/lib/IR/AffineMap.cpp | ||
---|---|---|
347 | Wait .. what do I see just above .. literally the same functionality modulo an assert .. Can we just have a single Optional<int64_t> AffineMap::getResultPosition(AffineExpr e) const { for (int64_t i = 0, numResults = getNumResults(); i < numResults; i++) if (getResult(i) == e) return i; return llvm::None; } and let clients do the assertions they want ? It seems very counterproductive to have all these special case functions with slightly varying assertions and hard to grok names .. |
Addressed feedback + minor fixes.
Please ignore the AffineMap utility. It will be removed after rebasing on top of D138946.
I do not understand the implication of
// TODO: We mask the transfer.transfer_write here because this op is // special-cased. A linalg.yield may produced multiple vector.transfer_write // ops and can't be mapped using BlockAndValueMapping. AffineMap opOperandMap = linalgOp.getMatchingIndexingMap(opOperand); write = state.maskOperation(b, write, linalgOp, opOperandMap);
.. also I am not seeing any test changes, so it seems you are adding a lot of code that is not tested and not activated ?
mlir/include/mlir/IR/AffineMap.h | ||
---|---|---|
178 | I think this goes away with the rebase, just flagging for removal so we don't forget. | |
mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp | ||
68 | Why not make this a ctor? | |
247 | We cannot have code like this: Operation *maskOpTerminator = &maskOp.getMaskRegion().front().back(); for (auto &en : llvm::enumerate(opToMask->getResults())) en.value().replaceAllUsesExcept(maskOp.getResult(en.index()), maskOpTerminator); it must use a RewriterBase with updateRootInPlace | |
283 | Hmm .. what's the contract between this createRegionMask lambda and the insertion points during I've seen too much ugly stuff re. insertion points leaking across function call boundaries. Let's add an OpBuilder::InsertionGuard at the top of this function. | |
299–300 | this must use a RewriterBase with updateRootInPlace | |
543 | nit: produce |
.. also I am not seeing any test changes, so it seems you are adding a lot of code that is not tested and not activated ?
Changes in the overall vectorization algorithm are tested with existing vectorization tests. This patch is NFC for those. Masking is not enabled if inputVectorSizes are not provided. If they are provided, only elementwise ops without reductions and fully dynamic shapes are vectorized.
I can't add unit tests until the new operation for masked vectorization is added to the transform dialect, as we no longer have the vectorizer testing pass. However, this PR has been extensively tested in IREE, both with and without masking for even more cases than the currently supported right now.
Waiting on the new transform dialect op to land to add more tests.
mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp | ||
---|---|---|
68 | Because it can fail, at least for now. This will change once we support more cases with masking. Then, we could assert and turn it into a constructor. | |
247 | I can use updateRootInPlace when we have a rewriter here but I can't do any replacement method because opToMask is moved inside the mask region, not replaced. | |
283 | Added guard. This is a simple lambda to create the op region. We follow the same approach for scf.if and other region ops. The only contract is that the region needs to have a vector::YieldOp, which is described in the vector.mask doc. | |
299–300 | Added TODO until we have a rewriter. | |
543 |
Good point. This is a comment for an old problem. I removed it and moved this code to buildVectorWrite. | |
mlir/lib/IR/AffineMap.cpp | ||
347 | Extracted this to https://reviews.llvm.org/D138946 |
Waiting on the new transform dialect op to land to add more tests.
Thanks for integrating it and adding tests, the testing part LGTM, till need to make another pass on the last version of the code.
Thanks @dcaballe !
mlir/include/mlir/Dialect/Linalg/TransformOps/LinalgTransformOps.td | ||
---|---|---|
1120 | "definite failure" | |
1138 | can you add a TODO: applyToOne plz ? | |
mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp | ||
222 | Use the find API and the iterator you get from it to avoid multi-lookups | |
263 | nit: uppermension :) ? | |
490 | can you add a TODO to tighten op semantics so that we don't mix inbounds and mask since this is well defined? | |
826 | I'll need to revisit all this in light of the broadcast separation. | |
972 | ok as a first appox. |
@dcaballe, I was looking at this PR as I was doing some spelunking and I realize we are not testing the case of the SSA value as well as the error case when the SSA value is not a constant.
Can we please add the missing tests in a a followup ?
"definite failure"