This is an archive of the discontinued LLVM Phabricator instance.

Add vector.scan op
ClosedPublic

Authored by harsh on Jan 12 2022, 4:02 PM.

Details

Summary

This patch adds the vector.scan op which computes the
scan for a given n-d vector. It requires specifying the operator,
the identity element and whether the scan is inclusive or
exclusive.

TEST: Added test in ops.mlir

Diff Detail

Event Timeline

harsh created this revision.Jan 12 2022, 4:02 PM
harsh requested review of this revision.Jan 12 2022, 4:02 PM

Some fly by comments. Can you say a bit more on the motivation for adding this op?
Will you also provide a lowering to LLVM IR?

mlir/lib/Dialect/Vector/VectorOps.cpp
4276

since you have a custom verifier, this revision also needs an invalid.mlir case that exercises all custom verification errors

4276

period at end

harsh added a comment.Jan 14 2022, 3:32 PM

Hi Aart,

Thanks for taking a look at this patch. Let me provide some motivation for this op. I have been working closely with the IREE folks on try to implement a scan operation that can run on the
cpu and gpu. Since scan doesn't fit into the linalg structured op framework, we decided to prototype this in a linalg_ext dialect. This dialect currently resides in iree but the plan is
to upstream this to mlir-core. Here is the op definition https://github.com/google/iree/blob/main/llvm-external-projects/iree-dialects/include/iree-dialects/Dialect/LinalgExt/IR/LinalgExtOps.td#L250
though I plan to update that and this patch based on our recent findings. Having defined this op, we needed to find a way to lower this to the gpu dialect and thought that the jump
from the linalg_ext abstraction to the gpu abstraction would be too large. Thus, we decided on creating a vector.scan op which would provide a more progressive lowering. The plan is to
first add a lowering to the gpu dialect (where the scan would be lowered to gpu.shuffle ops) that is performant and then provide a cpu lowering. Would love to hear any thoughts
you have on this and any ideas on how we could do this better.

Thanks,
Harsh

Having defined this op, we needed to find a way to lower this to the gpu dialect and thought that the jump
from the linalg_ext abstraction to the gpu abstraction would be too large. Thus, we decided on creating a vector.scan op which would provide a more progressive lowering. The plan is to
first add a lowering to the gpu dialect (where the scan would be lowered to gpu.shuffle ops) that is performant and then provide a cpu lowering. Would love to hear any thoughts
you have on this and any ideas on how we could do this better.

For most vector ops, we have an actual lowering to LLVM IR, just so we can run integration test on the new functionality. This also acts as extra "documentation" on what the op does.
This first lowering does not need to be super efficient, but if you see chance to add the lowering to this or a follow up CL, that would make the intended use of this op more clear to me.

Having defined this op, we needed to find a way to lower this to the gpu dialect and thought that the jump
from the linalg_ext abstraction to the gpu abstraction would be too large. Thus, we decided on creating a vector.scan op which would provide a more progressive lowering. The plan is to
first add a lowering to the gpu dialect (where the scan would be lowered to gpu.shuffle ops) that is performant and then provide a cpu lowering. Would love to hear any thoughts
you have on this and any ideas on how we could do this better.

For most vector ops, we have an actual lowering to LLVM IR, just so we can run integration test on the new functionality. This also acts as extra "documentation" on what the op does.
This first lowering does not need to be super efficient, but if you see chance to add the lowering to this or a follow up CL, that would make the intended use of this op more clear to me.

Sounds good. I will add an initial lowering to LLVM IR / Standard Dialect that should showcase the use of this op in the subsequent patch. Thanks!

mlir/lib/Dialect/Vector/VectorOps.cpp
4276

thanks will add an invalid.mlir

aartbik added inline comments.Jan 26 2022, 6:37 PM
mlir/lib/Dialect/Vector/VectorOps.cpp
4276

is this still TBD?

harsh updated this revision to Diff 403523.Jan 26 2022, 11:59 PM

Added lowering to arith ops, tests and integration tests.

@aartbik - based on discussions with @ThomasRaoux , we felt it made more sense to do a vector to vector conversion instead of going directly to llvm. I have added the patch here along with some tests. Please take a look and let me know what you think. Thanks!

mlir/lib/Dialect/Vector/VectorOps.cpp
4276

I have just uploaded a new patch. Thanks!

ThomasRaoux requested changes to this revision.Jan 27 2022, 10:29 AM

Thanks for adding an integration test! Few more comments.

mlir/include/mlir/Dialect/Vector/VectorOps.td
2438

vector source and result need to have the exact same type?

2460

source and destination types should be the same? It would be good to have them only once and add it as restriction to make the op look simpler.

2461

The IR doesn't look correct, should it be %1#1?

mlir/lib/Dialect/Vector/VectorOps.cpp
4278

you should be able to specify that in tablegen? (related to the comments there)

4286

has to be greater or equal than?

mlir/lib/Dialect/Vector/VectorTransforms.cpp
2382–2421

Move this function outside of the struct and add some documentation comment on top.

2423

Value can be nullptr so you don't need the Optional here.

Also as above move this function outside of the struct.

2453

nit, use: elType.isIntOrIndex()

2468

you can just initialize the offsets directly:
SmallVector<int64_t> offsets(dstRank, 0);
Same for sizes.

2505–2506

We should check precondition earlier not after generating a bunch of IR.

2525

remove

This revision now requires changes to proceed.Jan 27 2022, 10:29 AM
harsh marked 8 inline comments as done.Jan 27 2022, 4:33 PM
harsh added inline comments.
mlir/include/mlir/Dialect/Vector/VectorOps.td
2438

Yes thanks! I changed it to use TCresIsSameAsOpBase

2460

I added the constraint but couldn't make the assemblyFormat of the op simpler. If I remove type($dest) I get

error: type of result #0, named 'dest', is not buildable and a buildable type cannot be inferred
$kind `,` $source `,` $initial_value attr-dict `:` `(` type($source) `,` type($initial_value) `)` `to` `(` `,` type($accumulated_value) `)`
^
/home/harsh/llvm-project/mlir/include/mlir/Dialect/Vector/VectorOps.td:2436:1: note: in custom assembly format for this operation
def Vector_ScanOp :
^
note: suggest adding a type constraint to the operation or adding a 'type($dest)' directive to the custom assembly format
$kind `,` $source `,` $initial_value attr-dict `:` `(` type($source) `,` type($initial_value) `)` `to` `(` `,` type($accumulated_value) `)`
^

which is surprising since I already added the type constraint. Any ideas on how to get around this?

2461

Yes, I removed the line.

mlir/lib/Dialect/Vector/VectorOps.cpp
4278

thanks yes I added it in the tablegen file.

4286

If the rank is 4, then doesn't that mean the reduction dimension can only be 0, 1, 2 or 3? So doesn't reduction_dim < rank ?

mlir/lib/Dialect/Vector/VectorTransforms.cpp
2505–2506

So should I check if the kind matches early on and leave this check here? Or did you have something else in mind?

harsh updated this revision to Diff 403834.Jan 27 2022, 4:33 PM
harsh marked 3 inline comments as done.

Updated based on Thomas' comments

Looks good overall, I think you should be able to fix the syntax then it should be good to go.

mlir/include/mlir/Dialect/Vector/VectorOps.td
2460
mlir/lib/Dialect/Vector/VectorOps.cpp
4286

I got mixed, my comment was about the "has to be" it should be "has to be less than"

mlir/lib/Dialect/Vector/VectorTransforms.cpp
2357

This is a bit of a nit but could the two genOperator helper be merged into one and check the type based on Value x instead? This would be a bit less verbose.

2505–2506

I would have a precondition earlier and change this into an assert

harsh marked 3 inline comments as done.Jan 28 2022, 10:06 AM
harsh added inline comments.
mlir/include/mlir/Dialect/Vector/VectorOps.td
2460

Thanks that what I was missing. Now the asm looks nice :)

mlir/lib/Dialect/Vector/VectorOps.cpp
4286

got it. I replaced "has to be <" with "has to be less than".

mlir/lib/Dialect/Vector/VectorTransforms.cpp
2357

Yes, I have added this in the new patch.

2505–2506

sure sounds good

harsh updated this revision to Diff 404074.Jan 28 2022, 10:07 AM
harsh marked 3 inline comments as done.

Updated based on second round of comments

ThomasRaoux accepted this revision.Jan 28 2022, 11:47 AM
This revision is now accepted and ready to land.Jan 28 2022, 11:47 AM
This revision was automatically updated to reflect the committed changes.