This is an archive of the discontinued LLVM Phabricator instance.

[mlir][Vector] Thread 0-d vectors through vector.transfer ops
ClosedPublic

Authored by nicolasvasilache on Nov 30 2021, 7:46 AM.

Details

Summary

This revision adds 0-d vector support to vector.transfer ops.
In the process, numerous cleanups are applied, in particular around normalizing
and reducing the number of builders.

Diff Detail

Event Timeline

nicolasvasilache requested review of this revision.Nov 30 2021, 7:46 AM
Herald added a project: Restricted Project. · View Herald Transcript
ThomasRaoux added inline comments.Nov 30 2021, 8:03 AM
mlir/test/Dialect/Vector/ops.mlir
7–14

Can 0-D vector be read from non 0-D tensor? It would be good to add a test for it.

Address comment.
Use mandatory empty AffineMapAttr instead of optional null to reduce corner cases as suggested offline by @ThomasRaoux.

nicolasvasilache marked an inline comment as done.Nov 30 2021, 1:56 PM
springerm accepted this revision.Dec 1 2021, 1:42 AM
springerm added inline comments.
mlir/include/mlir/Dialect/Vector/VectorOps.td
1157

why did this change?

1329

extract from

1464

insert into

This revision is now accepted and ready to land.Dec 1 2021, 1:42 AM

Thanks for helping with the 0-d vector problem! Added some minor comments. I have a question about the lowering of 0-d vectors to the scalar world. See comments inline.

mlir/include/mlir/Dialect/Vector/VectorOps.td
1157

+1, same for xfer write

1306

and an?

mlir/lib/Conversion/VectorToROCDL/VectorToROCDL.cpp
69

These checks on GPU and ROCDL mean that 0-d tensors are not expected at all by these backends or it's more a TODO to be implemented in the future? If the latter, we should add a TODO comment staying that it will be implemented in the future.

mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
230

This looks like we are vectorizing the operand to be stored in place. Shouldn't this call to some generic function that takes care of that (vectorizeOneOp or some other one)?

mlir/lib/Dialect/Vector/VectorOps.cpp
2317

and an?

mlir/lib/Dialect/Vector/VectorTransforms.cpp
2804

Move comment before the if?

2870

load?

2888

Question about this lowering: why are we lowering 0-d vectors to the scalar world? I thought one of the goals discussed in the RFC was to keep everything within the vector world to avoid the scalar<->vector transition that could be very expensive for some targets. Shouldn't we lower 0-d vectors to a `vector<1xtype> in LLVM instead?

ThomasRaoux accepted this revision.Dec 1 2021, 7:02 AM
ThomasRaoux added inline comments.
mlir/lib/Conversion/VectorToGPU/VectorToGPU.cpp
74–76

we check that the rank must be 2 right below so this won't really change anything. I'm assuming you are just adding it from consistency? (same below)

nicolasvasilache marked 12 inline comments as done.Dec 1 2021, 8:24 AM
nicolasvasilache added inline comments.
mlir/include/mlir/Dialect/Vector/VectorOps.td
1157

Because the comment was off by 1.
The thinking is that if you start counting operands at 0 and:

  • plug in rank == 0 you now get [1 .. 1) which is

empty as expected; before it used to be 2 .. 1 which is weird

  • plug in rank == 1, you now get [1 .. 2) instead of the more ambiguous 2 .. 2.
1329

dead now

1464

dead

mlir/lib/Conversion/VectorToGPU/VectorToGPU.cpp
74–76

Yes the evolution has been that I broke the API to make sure by way of compiler crashes that I am not missing any entry. Then I just copied this everywhere.

In this very special case I can just remove indeed.

mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
230

This is a side-effect of the piece below that says:

// Not all ops support 0-d vectors, extract the scalar for now.
// TODO: remove this.
if (readValue.getType().cast<VectorType>().getRank() == 0)
  readValue = b.create<vector::ExtractElementOp>(loc, readValue);

Both should go away once all ops support 0-d vectors.

mlir/lib/Dialect/Vector/VectorTransforms.cpp
2888

Note that this is not changing any behavior here and we were already doing this extraction for vector<1x..x1xtype>.
Your point is very valid but I think is beyond this CL: we need to either:

  • go directly to LLVM + bitcast
  • introduce a bitcast
  • let memref.load/store additionally support the 0-d vector case which itself may be quite fraught with issues

There are still deeper data layout issues lingering even in this trivial case (for architectures for which this matters that is .. ).
For now I don't think I can do better but adding a TODO.

nicolasvasilache marked 6 inline comments as done.

Address comments and rebase.

This revision was landed with ongoing or failed builds.Dec 1 2021, 8:49 AM
This revision was automatically updated to reflect the committed changes.