This is an archive of the discontinued LLVM Phabricator instance.

[mlir][linalg] Refine `tensor.extract` vectorization
ClosedPublic

Authored by awarzynski on Apr 17 2023, 9:06 AM.

Details

Summary

This patch updates the vectorization of the extract Op so that the
permutation map for the transfer_read Op is defined explicitly by the
vectorizer.

This change is needed for cases where the rank of the source tensor is
lower than the rank of the output vector generated by the vectorizer:

mlir
    %17 = vector.transfer_read %arg1[%14, %16], %cst_4 {in_bounds = [true, true]} : tensor<257x24xf32>, vector<1x1x4xf32>

In cases like this, the vectorize will create the following permutation map:

(d0, d1) -> (0, d0, d1)

In other cases the behavior remains unchanged.

Fixes https://github.com/openxla/iree/issues/13036. That's also where
the test case was extracted from.

Diff Detail

Event Timeline

awarzynski created this revision.Apr 17 2023, 9:06 AM
Herald added a project: Restricted Project. · View Herald Transcript
awarzynski requested review of this revision.Apr 17 2023, 9:06 AM
awarzynski edited the summary of this revision. (Show Details)Apr 17 2023, 9:31 AM
tschuett added inline comments.
mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
955

Please z!

dcaballe added inline comments.Apr 17 2023, 10:48 AM
mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
955

I think neither American nor British English is enforced by the guidelines so Vectorised should be fine as long as it's consistent with the rest of the debug messages, which seems to be the case?

992

nit: .

998

It would help if you mentioned that this is to broadcast to the unitary leading dims

mlir/test/Dialect/Linalg/vectorization.mlir
1869

These constants would probably need a -DAG

1879

This should be a contiguous load, right? Are you fixing this separately?

1887

Hmm... How is possible that we generate a contiguous load if there is a gather in its address computation?

awarzynski marked an inline comment as done.Apr 20 2023, 1:36 AM

Thanks for taking a look!

mlir/test/Dialect/Linalg/vectorization.mlir
1879

This should be a contiguous load, right?

Not quite. It's a scalar load which is loop invariant. vector.gather is correct, but very inefficient.

Are you fixing this separately?

Dealing with this vector.gather separately makes more sense to me. Otherwise this patch would be addressing two different issues. Also, I'd like to resolve https://github.com/openxla/iree/issues/13036 before tackling anything else. I should be able to upload a seperate patch shortly.

1887

%extracted_0 = tensor.extract %input_1[%c0, %14] : tensor<1x20xi32> is loop invariant :) I'll add a comment above.

Update and add comments as per PR suggestions

dcaballe accepted this revision.Apr 20 2023, 8:47 AM

Thanks! It make sense!

This revision is now accepted and ready to land.Apr 20 2023, 8:47 AM
This revision was automatically updated to reflect the committed changes.