Page MenuHomePhabricator

[mlir][Vector] Enable create_mask for scalable vectors
ClosedPublic

Authored by jsetoain on Jan 26 2022, 7:08 AM.

Details

Summary

The way vector.create_mask is currently lowered is
vector-length-dependent, and therefore incompatible with scalable vector
types. This patch adds an alternative lowering path for create_mask
operations that return a scalable vector mask.

Diff Detail

Event Timeline

jsetoain created this revision.Jan 26 2022, 7:08 AM
jsetoain requested review of this revision.Jan 26 2022, 7:08 AM

Please block the patch for now, this is just something I'm thinking about, I'm also opening a post in discourse to get some feedback.

jsetoain updated this revision to Diff 404962.Feb 1 2022, 9:07 AM

Fix create mask folder. Move create_mask on scalable vectors from
VectorTransforms to ConvertVectorToLLVM

jsetoain updated this revision to Diff 407105.Feb 9 2022, 2:54 AM

Switching to llvm.intr.experimental.stepvector for lowering. This fixes
consistency issues with fixed-length create_mask operations.

c-rhodes added inline comments.Feb 9 2022, 7:30 AM
mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
919–920

this optimisation seems odd, I can't imagine there's any hardware out there with vectors approaching 2^64 elements (and 2^32 for that matter). Can this option be removed and always default i32?

923

getScalableVectorType?

928

I know this is based on the fixed lowering, but I wonder if this should be ult.

jsetoain marked 3 inline comments as done.Feb 9 2022, 9:11 AM
jsetoain added inline comments.
mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
919–920

The default is i32 (i.e.: optimize indices), but the fixed-length create_mask operation supports i64 indices so I believe it's only fair, for consistency, to support them for scalable vectors as well. Is it worth removing the option altogether? probably, but that should be a different patch :-)

923

It's broken and I'm not sure how to fix it. It's in the backlog.

928

If you do ult you have the same "wrap around" problem. It has to be signed in case the index is negative. There's a discussion here: [[ https://reviews.llvm.org/D116069 | [mlir][vector] Allow values outside of [0; dim-size] in create_mask]] about why this is the preferred behavior.

jsetoain updated this revision to Diff 407184.Feb 9 2022, 9:21 AM
jsetoain marked 3 inline comments as done.

Rebase on main

Does this need a round-trip test in mlir/test/Dialect/Vector/ops.mlir?

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
919–920

The default is i32 (i.e.: optimize indices), but the fixed-length create_mask operation supports i64 indices so I believe it's only fair, for consistency, to support them for scalable vectors as well. Is it worth removing the option altogether? probably, but that should be a different patch :-)

I agree, better to be consistent and remove in another patch. Just an observation :)

928

If you do ult you have the same "wrap around" problem. It has to be signed in case the index is negative. There's a discussion here: [[ https://reviews.llvm.org/D116069 | [mlir][vector] Allow values outside of [0; dim-size] in create_mask]] about why this is the preferred behavior.

the vector.create_mask -> vector.constant_mask canonicalization for negative values should happen before this lowering?

jsetoain updated this revision to Diff 409701.Feb 17 2022, 9:56 AM
jsetoain marked an inline comment as done.

Accept particular cases of scalable constant masks.

jsetoain marked an inline comment as done.Feb 17 2022, 10:14 AM
jsetoain added inline comments.
mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
928

I've included the right flow and constrains for create_mask -> constant_mask canonicalization.

jsetoain marked an inline comment as done.Feb 17 2022, 10:15 AM
c-rhodes added inline comments.Feb 21 2022, 5:43 AM
mlir/include/mlir/Dialect/Vector/Transforms/VectorTransforms.h
94 ↗(On Diff #409701)

nit: an

mlir/lib/Dialect/Vector/IR/VectorOps.cpp
4242

== 0? Or to be 0?

https://mlir.llvm.org/docs/Dialects/Vector/#vectorconstant_mask-mlirvectorconstantmaskop

Each value of ‘mask_dim_sizes’ must be non-negative and not greater than the size of the corresponding vector dimension (as opposed to vector.create_mask which allows this).

mlir/test/Dialect/Vector/ops.mlir
375

nit: unrelated change

jsetoain updated this revision to Diff 410314.Feb 21 2022, 8:38 AM

Address comments

jsetoain marked 3 inline comments as done.Feb 21 2022, 8:39 AM

LGTM, but I'm pretty new around these parts so I'll leave it to another reviewer to accept

@ftynse Hi Alex, I just wanted to kindly remind you about this patch, after the discussion in discourse nobody else seems to have anything against it, and it's currently blocking a stack of approved patches. Thank you!

Herald added a project: Restricted Project. · View Herald TranscriptMar 4 2022, 8:39 AM

LGTM, let's wait for Alex.
Few remaining nits

mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td
1755

period at end

mlir/lib/Dialect/Vector/Transforms/VectorTransforms.cpp
607

period at end

mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
1466

although this will match

genbool_1d_scalable

is a better LABEL!

jsetoain updated this revision to Diff 414038.Mar 9 2022, 2:01 AM
jsetoain marked 3 inline comments as done.

Address reviewer comments

Thanks for the review, Aart!

Thanks for the contribution, Javier! Some comments inline.

mlir/include/mlir/Conversion/VectorToLLVM/ConvertVectorToLLVM.h
69

Add doc about what indexOptimizations is actually enabling?

mlir/include/mlir/Dialect/Vector/Transforms/VectorTransforms.h
99 ↗(On Diff #414038)

Now that we are making this utility public, I wonder if it would make more sense to move it somewhere else since it's not a vector specific transformation, right? Maybe we could move it to the builder class? It looks like a builder method to me.

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
905

Would it make sense to align this lowering with the non-scalable version (i.e., moving it to Vector Transforms)? I think create_mask is a relatively high level op that would make sense to lower to something simpler before we lower it to LLVM. That would align with the approach that we follow for similar vector ops and would make the LLVM lowering simpler (which is complex already). (Hopefully I'm not missing any context. Feel free to ignore this comment you already discussed this).

919–920

+1 to removing this. It's a bit odd that we shrink the index type as part of the lowering. I think that kind of transformation should happen as a separate step before the lowering to LLVM.

jsetoain updated this revision to Diff 417883.Mar 24 2022, 4:36 AM
jsetoain marked 3 inline comments as done.

Address reviewer comments

mlir/include/mlir/Dialect/Vector/Transforms/VectorTransforms.h
99 ↗(On Diff #414038)

That creates an awkward dependency between common MLIR and the Arithmetic dialect, but it does make a lot of sense to move this to Arithmetic Utils (I've also renamed it to match similar utilities).

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
905

The reason to separate them was that the scalable lowering depends on LLVM IR much earlier than the fixed-length version. In a way, create_mask for scalable vectors is a lower level operations than create_mask for fixed-length vectors. If we move scalable create_mask to Vector Transforms, we add an additional dependency there. If we want to unify both lowerings, I think it would make more sense to move create_mask to conversions. There was some discussion in the thread I created for this, and the conclusion was that having two lowerings for one operation was not an issue. If that not the case any more or you think it doesn't apply to this case, I'm happy to move things around. It's a trivial change :-)

ftynse accepted this revision.Mar 24 2022, 6:54 AM
ftynse added inline comments.
mlir/include/mlir/Conversion/VectorToLLVM/ConvertVectorToLLVM.h
66

Why not call it assume32BitIndices then?

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
911

Nit: no need for explicit mlir:: here.

mlir/lib/Dialect/Vector/IR/VectorOps.cpp
4286–4287

Use matchPattern(m_constantInt(...)) instead of explicitly matching for arith.constant here.

This revision is now accepted and ready to land.Mar 24 2022, 6:54 AM
ftynse added inline comments.Mar 24 2022, 6:57 AM
mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
919–920

Do you all know that the bitwidth of index is parameterized through the data layout mechanism? You can also define this as a full-fledged ConversionPattern, get hold of a TypeConverter instance and make it convert the actual IndexType to get you the integer type of the expected bitwidth. Anything else will run into type mismatches sooner or later.

jsetoain updated this revision to Diff 417939.Mar 24 2022, 8:33 AM
jsetoain marked 3 inline comments as done.

Address reviewer comments

jsetoain added inline comments.Mar 24 2022, 8:35 AM
mlir/include/mlir/Conversion/VectorToLLVM/ConvertVectorToLLVM.h
66

The only reason is that it is the name used elsewhere, but I agree it's not a great name. Once this has landed, I will push a NFC patch changing the name everywhere, probably to something like assume32BitVectorIndices to avoid confusion.

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
919–920

I believe this is handled independently from that mechanism because you might want to have IndexType to i64 conversion, and yet generate i32 indices in this very particular case. While your loop indices might go well beyond 2^32, the length of your physical vector is unlikely to do so.

dcaballe accepted this revision.Mar 24 2022, 12:01 PM
dcaballe added inline comments.
mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
905

If it was discussed already, it's fine then! We can align both lowerings in the future. We may have a better picture once we introduce further masking support.

The reason to separate them was that the scalable lowering depends on LLVM IR

What is missing on the MLIR side to be able to do the lowering? StepVectorOp?

jsetoain marked 2 inline comments as done.Mar 24 2022, 12:48 PM
jsetoain added inline comments.
mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
905

StepVectorOp would unify the lowering, but, unless we find more use cases, I'm not sure its worth the change. The documentation of llvm.experimental.stepvector says that the recommended way to do this for fixed-length vectors is by using a constant vector. This would lead to a unification on the lowering of create_mask, but then a dual lowering for StepVectorOp (to the intrinsic or the constant vector, depending on scalability of result). I'd say it's just marginally better, if that. If as we expand mask creation and manipulation in MLIR we find this op is useful, I'll take care of unifying the lowering of this op.

jsetoain marked 2 inline comments as done.Mar 25 2022, 2:34 AM
This revision was automatically updated to reflect the committed changes.