This is an archive of the discontinued LLVM Phabricator instance.

[MLIR][Affine][VectorOps] Utility to vectorize loop nest using strategy
ClosedPublic

Authored by dcaballe on Aug 12 2020, 5:30 PM.

Details

Summary

This patch adds a utility based on SuperVectorizer to vectorize an
affine loop nest using a given vectorization strategy. This strategy allows
targeting specific loops for vectorization instead of relying of the
SuperVectorizer analysis to choose the right loops to vectorize.

As part of the refactoring needed in SuperVectorizer to make the necessary
pieces public, we replaced NestedMatcher with SmallVector in some functions as
the mechanism to pass loop candidates over the algorithm. This allows us not to
expose NestedMatcher in the public API so that the client can provide loop
candidates using any other approach.

Please, let me know if you have any other ideas to keep the public API generic.
Another option could be using 'std::vector<SmallVector<AffineForOp, 2>>', to
preserve the parent-children relationship (we use something similar in
utility 'gatherLoops'[1].

[1] https://github.com/llvm/llvm-project/blob/master/mlir/include/mlir/Transforms/LoopUtils.h#L273

Diff Detail

Event Timeline

dcaballe created this revision.Aug 12 2020, 5:30 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 12 2020, 5:30 PM
dcaballe requested review of this revision.Aug 12 2020, 5:30 PM
aartbik added a comment.EditedAug 13 2020, 4:00 PM

Thanks for your CL, Diego. I am not sure if Nicolas will respond soon since he is traveling.

I will try to start a review in the meantime, but I need a little bit more time than Nicolas would to read this carefully :-)

Thanks for your patience. So far I only looked at lowering structured ops into vectorized code, not really into this "super" vectorizer that extracts vector code from scalar loops (even though I am of course a big fan of that :-). It looks like an interesting beginning, and refactoring loop nest vectorization into a public API seems useful.

mlir/include/mlir/Dialect/Affine/Utils.h
39

on a single loop nest

40

since this is now a "public" API, an example in the top level comment would help
(e.g. nest with two loops, and fill in the vector/map below with sample values)

40

Nicolas had the following TODO on the original anonymous structure, you may want to keep this here too, or check with him if this is still the plan

TODO: Hoist to a VectorizationStrategy.cpp when appropriate.

45

typo: vectorize -> vectorized

90

Nest's loops reads a bit awkward. How about

The loops in the loop nest must be ...

mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp
857

there is a TAB char here, replace with spaces

862

here and below, again TAB chars

884

having a

} // end for

will improve readability a bit (that, and fixing the TABs)

1253

break after "after" to even this out better

1310

Function -> function

1346

// end for

mlir/test/Dialect/Affine/SuperVectorize/vectorize_2d.mlir
127

can we do better on the map capture?

bondhugula resigned from this revision.Aug 19 2020, 3:46 PM
dcaballe planned changes to this revision.Aug 27 2020, 7:49 PM

Thanks, Aart! Sorry for the delay. I've been OOO. I'll take a look at your comments and I'll also try to use 'std::vector<SmallVector<AffineForOp, 2>>' for the API so that we can preserve some kind of loop nest structure.

mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp
862

Sorry about this! Not sure what happened... my editor shouldn't generate tabs at all...

Thanks, Aart! Sorry for the delay. I've been OOO. I'll take a look at your comments

No worries. If you wait a bit longer, Nicolas will be back too :-)

dcaballe updated this revision to Diff 289030.Aug 31 2020, 3:09 PM
dcaballe marked 11 inline comments as done.

Addressed all the comments and replaced the DFS loop container in the API with
an 'std::vector<SmallVector<AffineForOp, 2>>' (we use something similar in other
loop utilities). This container is a bit more friendly for a public API than
providing a plain list of loops in DFS order and also explicitly represents the
nesting level relationship of the loops that we want to vectorized, which is
useful for the vectorization algorithm itself. The vectorizer is therefore updated
to work on this container.

mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp
857

Not sure what's going on. I don't see any tab in my editor. I'll reformat the line, just in case. Maybe my patch is replacing an existing tab with spaces or some other weird character? Clang-format seems to be happy with the new version.

862

Same here. Hopefully it's fixed now.

1253

reformatted

1346

This loop is really small, right? Hopefully we don't have to add this comment to all the loops. Is this in the coding guidelines? If that's the case, maybe it's better to 'enforce' that with clang-format (like for namespaces)?

nicolasvasilache accepted this revision.Sep 1 2020, 8:36 AM

Thanks for this @dcaballe, I'm. a big fan of more composable APIs and less blackbox heuristics.

mlir/include/mlir/Dialect/Affine/Utils.h
111

Could you please beef up this example a bit to give some intuition about how things are specified in the second dimension?
The examples so far only show 1 loop in the second dimension.
I think I see how this can represent imperfectly nested loops of different depths but more doc always helps.

mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp
863

While you are at it maybe cleanup trivial braces ?
The style was not established at the time this code was written so there are likely a few more in this file.

1142

I agree it is nicer to extract this recursive iteration into its own logic and populate a more easy to use auxiliary data structure.
Thanks for this cleanup!

1351

I think it should be possible to simplify the logic by using isAncestor / isProperAncestor ?

1357

Shouldn't you also assert that for each pair of loops "m, n" in loops[i], they are not nested ?
I think I can see ways to construct 2-D vectors that may break things.

This revision is now accepted and ready to land.Sep 1 2020, 8:36 AM
dcaballe updated this revision to Diff 289521.Sep 2 2020, 11:36 AM
dcaballe marked 2 inline comments as done.

Addressed feedback. Thanks!

Thanks! Any other comments? @aartbik?

mlir/include/mlir/Dialect/Affine/Utils.h
111

Something like this?

mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp
863

Sure!

1351

Ok! I think it would be a bit less efficient since to use isAncestor we have to traverse the "ancestry chain" multiple times (the current approach only traverses it once), but I don't expect the number of nesting levels and loops to be huge so it should be fine. The second check suggested below fits better following isAncestor approach here.

mlir/include/mlir/Dialect/Affine/Utils.h
111

I was thinking of something like:

///   affine.for %i0 = 0 to 64 {
///     affine.for %i1 = 0 to 128 {
///       affine.for %i2 = 0 to 512 {
///         %ld = affine.load %in[%i0, %i1, %i2] : memref<64x128x512xf32>
///         affine.store %ld, %out[%i0, %i1, %i2] : memref<64x128x512xf32>
///       }
///     }
///     affine.for %i3 = 0 to 128 {
///       affine.for %i4 = 0 to 1024 {
///         %ld = affine.load %in[%i0, %i3, %i4] : memref<64x128x512xf32>
///         affine.store %ld, %out[%i0, %i3, %i4] : memref<64x128x512xf32>
///       }
///     }
///   }

and then:

/// loops = {{%i0}, {%i2, %i4}}, to vectorize the first, third and fifth loops,

Assuming I got it right .. :) ?

dcaballe updated this revision to Diff 292027.Sep 15 2020, 2:32 PM

Sorry for the delay and sorry, Nicolas, I had written the example that
you were asking for in the .cpp doc but I forgot to copy it to the header
file. This patch is doing that. Please, let me know if there are any other
comments. Thanks!