This is an archive of the discontinued LLVM Phabricator instance.

[mlir][vector] add higher dimensional support to gather/scatter
ClosedPublic

Authored by aartbik on Feb 24 2021, 3:59 PM.

Details

Summary

Similar to mask-load/store and compress/expand, the gather and
scatter operation now allow for higher dimension uses. Note that
to support the mixed-type index, the new syntax is:

vector.gather %base [%i,%j] [%kvector] ....

The first client of this generalization is the sparse compiler,
which needs to define scatter and gathers on dense operands
of higher dimensions too.

Diff Detail

Event Timeline

aartbik created this revision.Feb 24 2021, 3:59 PM
aartbik requested review of this revision.Feb 24 2021, 3:59 PM
aartbik updated this revision to Diff 326231.Feb 24 2021, 4:03 PM

forgot examples in doc

aartbik added inline comments.Feb 24 2021, 7:32 PM
mlir/include/mlir/Dialect/Vector/VectorOps.td
1712–1713

@rriddle I picked this somewhat awkward syntax to be able to express the variadic-index-list followed by single-vector-index using the assemblyFormat syntax

is there a better way? Ideally I would like to simply have

vector.gateher %base[%i,%j,%kvector]

but I had a hard time figuring out how to make the separating comma depend on the first list being non-empty

Hey Aart!

This looks interesting! Quick comment on the semantics. Is the base address computation never expected to have a scalar index for the first dimension? For example:

// Gather takes 2 scalar indices for a 2-d base memref. Would this be allowed?
%0 = vector.gather %X[%i, %j][%aidx], %mask, %pass : memref<10x20xf32>

I think providing as many scalar indices as the rank of the memref might be more aligned with what we can represent in LLVM, where the gather intrinsic takes a based pointer that may also include a scalar offset with respect to the first dimension. I also remember cases where moving computation from the vector indices to the base address yielded better performance. For example:

// 'iv' offset is computed using vector operations, which is unnecessary.
%base = ...
%vector_indices = vector_add(vector_broadcast(%iv), %some_vector_offsets)
%gather = gather(%base, %vector_indices)

vs

// 'iv' offset is part of the base address scalar computation.
%base = ... + iv
gather %base, %some_vector_offsets

Although I'm not sure if LLVM will turn the former into the latter for us.
WDYT?

mlir/include/mlir/Dialect/Vector/VectorOps.td
1637

%0 -> %mask, %1 -> %value?

Hey Aart!

This looks interesting! Quick comment on the semantics. Is the base address computation never expected to have a scalar index for the first dimension?

Yes that would work. The lowering code would actually correctly deal with that (since is reuses existing utilities with this change).
If you find that more intuitive, then I will change the verifier to accept "rank" indices in the first part, and one 1-D vector to add to the last dim.

aartbik updated this revision to Diff 326521.Feb 25 2021, 2:47 PM

used Diego's suggestion:

vector.gather [<indices-for-all-dimensions>] [<index vector>], ....

aartbik added inline comments.Feb 25 2021, 2:49 PM
mlir/include/mlir/Dialect/Vector/VectorOps.td
1712–1713

With the new convention of having all dimensions in the first [ ], it actually makes more sense to have a new syntax with the second [ ] to make the different semantics more apparent.

(sorry, replace "first dimension" with "last dimension" in my prev. comment :))

vector.gather [<indices-for-all-dimensions>] [<index vector>], ....

Thanks! I think that makes more sense to me and now keeping the index vector and scalar indices separate in the assembly format makes more sense.

LGTM! Adding some minor comments.

mlir/include/mlir/Dialect/Vector/VectorOps.td
1662

Should $index be a vector of index types? Or we don't support them yet?

1712–1713

maybe use $index_vec or similar instead of $index so that is not so close to $indices?

1771

same comments for scatter.

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
182

Just trying to understand... why the last stride needs to be 1 and the memory space zero?

mlir/lib/Dialect/Linalg/Transforms/Sparsification.cpp
657

Is the client somehow expecting the last index to be replaced with this new ConstantIndexOp? Otherwise, I would suggest creating a new container and keep args invariant to avoid surprising behavior.

669

nit: return early and remove else, as you do above?

988

nit: not sure if it's in the guidelines but adding messages to asserts is very useful: assert(cond && "...");

1245

is this related to this patch or just sneaked into it? :)

mlir/lib/Dialect/Vector/VectorOps.cpp
2784

nit: maybe I'm wrong and you tried already but I think some of these invariants could be described in ODS with AllElementTypesMatch, AllShapesMatch, and the like?

aartbik marked 9 inline comments as done.Feb 25 2021, 5:52 PM
aartbik added inline comments.
mlir/include/mlir/Dialect/Vector/VectorOps.td
1662

No vector of index yet. Hence this spec.

mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
182

Because otherwise we need to multiply the elements in the index vector by that much.

mlir/lib/Dialect/Linalg/Transforms/Sparsification.cpp
657

This is the only and last use and it was shorter. But yes, it may be cleaner to modify a fresh copy.

988

Agreed that in some cases the message may say something useful, but here the assert is just what it says, it should be an index.

1245

Yes and no. It could go in is own CL if you insist, but it was related to the scatter/gather. I actually wanted to do the whole sparse compiler part as a follow up CL, but due to changed in the builder for scatter and gather, had to change this file anyway. It seemed silly to change it without including the fix.

mlir/lib/Dialect/Vector/VectorOps.cpp
2784

Ah, yes, that is probably possible (a lot of legacy verify tests could probably be expressed with ODS as it is evolving). Let's keep that separate though.

aartbik marked 6 inline comments as done.Feb 25 2021, 5:53 PM

Thanks Diego!

aartbik updated this revision to Diff 326573.Feb 25 2021, 6:21 PM

addressed comments

aartbik updated this revision to Diff 326745.Feb 26 2021, 11:10 AM

lint issue

bixia accepted this revision.Feb 26 2021, 12:00 PM
This revision is now accepted and ready to land.Feb 26 2021, 12:00 PM

Thanks, Aart! I just replied to some comments. Nothing else from my side!
Maybe it would be good that somebody else from your side took a look, as well.

mlir/lib/Dialect/Linalg/Transforms/Sparsification.cpp
988

Agree. However, explaining *why* helps a lot in debugging and reporting bugs, esp. for those that are not familiar with the code. Not sure about the specifics here but saying something like "sparse tensors must have a vector index" or something like that would be helpful. There are similar examples in the guidelines that could help: https://llvm.org/docs/CodingStandards.html#assert-liberally

assert(Ty->isPointerType() && "Can't allocate a non-pointer type!");
assert(isa<PHINode>(Succ->front()) && "Only works on PHId BBs!");
1245

No problem. Just wanted to make sure that this wasn't inadvertently going in :)

aartbik marked an inline comment as done.Feb 26 2021, 12:57 PM
aartbik added inline comments.
mlir/lib/Dialect/Linalg/Transforms/Sparsification.cpp
988

Okay, added a string.

aartbik updated this revision to Diff 326780.Feb 26 2021, 1:01 PM
aartbik marked an inline comment as done.

added strings to asserts

This revision was landed with ongoing or failed builds.Feb 26 2021, 2:20 PM
This revision was automatically updated to reflect the committed changes.