This is an archive of the discontinued LLVM Phabricator instance.

[mlir][vector] First step of vector distribution transformation
ClosedPublic

Authored by ThomasRaoux on Sep 25 2020, 2:49 PM.

Details

Summary

This is the first of several steps to support distributing large vectors. This patch adds the extract_map and insert_map operations that allow incremental lowering. Right now the transformation only applies to simple pointwise operations with a vector size matching the dimension of the IDs used to distribute the vector.
This can be used to distribute large vectors to loops or SPMD.
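As an illustration, here is a sketch of what the distributed IR might look like for a 32-wide pointwise addition. The exact printed form of extract_map/insert_map evolved during review, and the %id operand name is an assumption, so treat the syntax as illustrative only:

```mlir
// Hypothetical output of the distribution transformation: each id owns a
// 1-element slice of the 32-wide vectors. Syntax is illustrative only.
func @distribute_vector_add(%A: vector<32xf32>, %B: vector<32xf32>, %id: index) -> vector<32xf32> {
  %a = vector.extract_map %A[%id : 32] : vector<32xf32> to vector<1xf32>
  %b = vector.extract_map %B[%id : 32] : vector<32xf32> to vector<1xf32>
  %sum = addf %a, %b : vector<1xf32>
  %r = vector.insert_map %sum, %id, 32 : vector<1xf32> to vector<32xf32>
  return %r : vector<32xf32>
}
```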

Diff Detail

Event Timeline

ThomasRaoux created this revision.Sep 25 2020, 2:49 PM
Herald added a project: Restricted Project.
ThomasRaoux requested review of this revision.Sep 25 2020, 2:49 PM

In general the semantics of this op isn't clear to me. It seems that we "hide" the SIMT aspect by "faking" a scalar computation.
Have you looked into structuring this in a region instead?

func @distribute_vector_add(%A: vector<32xf32>, %B: vector<32xf32>) -> vector<32xf32> {
  %0 = addf %A, %B : vector<32xf32>
  return %0: vector<32xf32>
}

->

func @distribute_vector_add(%A: vector<32xf32>, %B: vector<32xf32>) -> vector<32xf32> {
  %C = simt.execute %A to %Aelt : vector<1xf32> affine_map<()[s0] -> (s0)> 
                                   %B to %Belt : vector<1xf32> affine_map<()[s0] -> (s0)> {
    %0 = addf %Aelt, %Belt : vector<1xf32>
    yield %0
  }
  return %C: vector<32xf32>
}

In general the semantics of this op isn't clear to me. It seems that we "hide" the SIMT aspect by "faking" a scalar computation.
Have you looked into structuring this in a region instead?

func @distribute_vector_add(%A: vector<32xf32>, %B: vector<32xf32>) -> vector<32xf32> {
  %0 = addf %A, %B : vector<32xf32>
  return %0: vector<32xf32>
}

->

func @distribute_vector_add(%A: vector<32xf32>, %B: vector<32xf32>) -> vector<32xf32> {
  %C = simt.execute %A to %Aelt : vector<1xf32> affine_map<()[s0] -> (s0)> 
                                   %B to %Belt : vector<1xf32> affine_map<()[s0] -> (s0)> {
    %0 = addf %Aelt, %Belt : vector<1xf32>
    yield %0
  }
  return %C: vector<32xf32>
}

I haven't tried using a region. The downside of using a region is that it makes canonicalization harder. This is just the first step and those operations are meant to be transient ops that will eventually get combined with memory accesses. What do you think?

Using SIMT regions would be like creating a parallel loop around the code. Here I'm trying to distribute the vector along a specific ID.

That being said, @nicolasvasilache might have a better answer since he had the original idea of having such operations to allow incremental lowering for vector distribution.

nicolasvasilache requested changes to this revision.Sep 28 2020, 1:35 AM

Thanks for starting this @ThomasRaoux.

In general the semantics of this op isn't clear to me. It seems that we "hide" the SIMT aspect by "faking" a scalar computation.
Have you looked into structuring this in a region instead?

func @distribute_vector_add(%A: vector<32xf32>, %B: vector<32xf32>) -> vector<32xf32> {
  %0 = addf %A, %B : vector<32xf32>
  return %0: vector<32xf32>
}

->

func @distribute_vector_add(%A: vector<32xf32>, %B: vector<32xf32>) -> vector<32xf32> {
  %C = simt.execute %A to %Aelt : vector<1xf32> affine_map<()[s0] -> (s0)> 
                                   %B to %Belt : vector<1xf32> affine_map<()[s0] -> (s0)> {
    %0 = addf %Aelt, %Belt : vector<1xf32>
    yield %0
  }
  return %C: vector<32xf32>
}

Parallel semantics are orthogonal to the mechanisms underlying the composition of patterns that propagate distribution of SSA values to address computations.
In particular, the patterns will also be useful to break down big vectors into smaller ones going through a loop + memory.
These will be run and tested on single-thread scalar CPU code.

Users of this pattern will include transformations that create ops with regions with parallel semantics.
I envision this is the point when the question that @mehdi_amini raises will come into play.
I would imagine reusing the abstraction that makes sense at the GPU-level for this (and which should be more general and e.g. support control-flow, divergence, synchronizations etc)?

mlir/include/mlir/Dialect/Vector/VectorOps.td
459

I would rename dimension to multiplicity.
In future revisions, I expect this to become variadic in either id, dimension or both.

464

vector size / multiplicity

465

I'd rephrase to a Value such as a loop induction variable or an SPMD id.

468

s/merged with/folded into ?

You may also want to have something along the lines of Similarly to vector.tuple_get, this operation is used for progressive lowering and should be folded away before converting to LLVM.

474

How about a syntax like: vector.extract_map optional_map %v[%id:32]

So that in the future we could have:
vector.extract_map optional_map %v[%id0:32][%id1:16]

Or even, when it makes sense:
vector.extract_map optional_map %v[%id0:%sz0][%id1:%sz1]

702

vector size * multiplicity

703

Same as above re value, induction variables and SPMD id.

705

Same as above.

mlir/include/mlir/Dialect/Vector/VectorTransforms.h
175

Why not just take a Value here ?
Do you foresee actually creating the Value on the fly in the intended long-term usage?
In the case that the Value is a loop induction variable this would look fishy.

If it is just for testing purposes, we can write tests + passes that expect a certain structure and take advantage of that.

201

This will create many patterns when instantiated on many possible pointwise vector ops.
How about creating a vector::PointwiseInterface and adding it to proper operations ?

Not necessary to do it in this PR but let's do it in the immediate next PR.

201

I was thinking of structuring this differently:

  1. DistributeVectorPattern<Op> only adds a pair of extract_map, insert_map after the op.
  2. then canonicalization patterns kick in greedily: extract_map ops "go up" until they fold into a transfer_read; insert_map ops "go down" until they fold into a transfer_write.

Triggering the transformation can be done in a few ways:

  1. either have a transform / pass explicitly call distribution to insert the pair of ops and later apply the canonicalization patterns (in that case the match failure condition is when a vector op flows into an extract_map op).
  2. add additional filtering constraints so that the patterns apply only in certain conditions: e.g. based on type (vector op of size "n", presence of an attribute, keeping a set of "seen" ops in the pattern).
  3. traverse the use-def chains until we see a use of the appropriate %id, in which case bail. However, this is likely to become expensive and attribute/seen op from 2. is a memoization of this.

The advantage is we only need to implement 2 canonicalization patterns (one for pull extract up through an op, the other for pushing insert down through an op) instead of 3 (insert + extract + rewrite op as "extract-op-insert").
This will be much simpler once we have to start dealing with permutation / indexing maps.
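As a concrete sketch of step 2, the intended folding might look like the following. The transfer ops, types, and the extract_map syntax here are assumptions for illustration:

```mlir
// Before propagation (assumed syntax): a full 32-wide read, then an extract_map.
%v = vector.transfer_read %mem[%c0], %f0 : memref<32xf32>, vector<32xf32>
%e = vector.extract_map %v[%id : 32] : vector<32xf32> to vector<1xf32>

// After the extract_map is pulled up into the transfer_read, only the
// 1-element slice owned by %id is ever loaded:
%e2 = vector.transfer_read %mem[%id], %f0 : memref<32xf32>, vector<1xf32>
```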

mlir/lib/Dialect/Vector/VectorTransforms.cpp
2421

Can we turn this into a canonicalization pattern on ExtractMapOp?

2436

I don't think this is necessary, see my general comment above.

2453

You shouldn't have this called by the rewrite pattern.
Code should be restructured to avoid this.

mlir/test/lib/Transforms/TestVectorTransforms.cpp
137

I would restructure this test pass to just take the first index of the enclosing function and use that as the value to distribute on.

I would use a hardcoded custom op "test_distribution_value" that just takes an index.
In the test IR you can then just have a func argument %arg0 : index and do test_distribution_value(%arg0).

In the future, we can plug that to a loop induction variable and write an integration test.

This revision now requires changes to proceed.Sep 28 2020, 1:35 AM

Maybe remove SPMD from the title and commit message as this is intended to also be useful for sequential vector code?

aartbik added inline comments.Sep 28 2020, 11:08 AM
mlir/include/mlir/Dialect/Vector/VectorOps.td
463

1-D to be consistent with other spelling in this file

465

this is an extremely concise description, can you provide a bit more detail?
an example would be helpful too

698

this is a copy-and-paste left over

700

1-D

mravishankar requested changes to this revision.Sep 28 2020, 11:33 AM
mravishankar added a subscriber: mravishankar.
mravishankar added inline comments.
mlir/include/mlir/Dialect/Vector/VectorOps.td
457

This name is confusing because some of the terms are overloaded. My first instinct on seeing map was to expect an affine_map, but that is not the case. And there is no "extraction" happening here. How about Vector_DistributeSliceOp?

694

Similar to above here, but this is more complicated. Each distributed ID is inserting a slice of the vector. Are the semantics of the operation that this is an "insert and broadcast", i.e., do all distributed IDs have access to the resulting inserted value?

I feel like this has some implicit synchronization behavior. Are all the threads "inserting" here expected to be synchronized at this point?

Address review comments.

ThomasRaoux marked 11 inline comments as done.Sep 28 2020, 1:00 PM
ThomasRaoux added inline comments.
mlir/include/mlir/Dialect/Vector/VectorOps.td
465

I added examples and more details based on Nicolas' comments. Let me know if you think this is still hard to understand.

474

I changed the syntax as suggested. I'm planning to add the optional map in following patches.

mlir/include/mlir/Dialect/Vector/VectorTransforms.h
175

I was thinking we will need to create the Value on the fly at some point, which is why I had it that way. But I'm not sure this will be needed right now, so I changed it to just pass a Value for now.

201

I went for solution 1. Let me know if this is different than what you had in mind.

mlir/lib/Dialect/Vector/VectorTransforms.cpp
2436

Based on your general comment, the pattern is gone and I only left the transformation that can be called directly.

2453

Removed this pattern based on other comments.

mlir/test/lib/Transforms/TestVectorTransforms.cpp
137

Ok, changed it to do that.

ThomasRaoux retitled this revision from [mlir][vector] First step of vector SPMD distribution transformation to [mlir][vector] First step of vector distribution transformation.Sep 28 2020, 1:17 PM
ThomasRaoux edited the summary of this revision.
ThomasRaoux added inline comments.Sep 28 2020, 1:22 PM
mlir/include/mlir/Dialect/Vector/VectorOps.td
457

The goal is to have an affine_map indeed. This is just the first step and I will add an affine_map operand as well. This doesn't really distribute, so I'm not sure about calling it distributeSliceOp. Maybe it should be called getSliceOp or something similar? @nicolasvasilache, any thoughts on the best name for this op?

694

This doesn't imply anything about synchronization. I think I made it confusing by mentioning SPMD, but this is not directly related. Whether or not the different elements are synchronized depends on the semantics of the program.
For instance this could be used to distribute elements over a serialized loop; the lowering patterns should do the right thing to not break the semantics if cross-lane operations are needed.
Do you think this is something that needs to be added to the description or the latest updates are enough to clarify that this doesn't imply SPMD?
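To make the sequential case concrete, distribution to a serialized loop might look like the following sketch. The scf.for form and the op syntax are assumptions for illustration:

```mlir
// Each iteration of the sequential loop processes the 1-element slice
// owned by the induction variable, mirroring what an SPMD id would do.
scf.for %i = %c0 to %c32 step %c1 {
  %a = vector.extract_map %A[%i : 32] : vector<32xf32> to vector<1xf32>
  %b = vector.extract_map %B[%i : 32] : vector<32xf32> to vector<1xf32>
  %s = addf %a, %b : vector<1xf32>
  // ... %s would eventually fold into a 1-element transfer_write
}
```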

mravishankar requested changes to this revision.Sep 28 2020, 3:00 PM
mravishankar added inline comments.
mlir/include/mlir/Dialect/Vector/VectorOps.td
694

This makes sense. Thanks for the clarification.

mlir/lib/Dialect/Vector/VectorTransforms.cpp
2453

For such methods it is better to just return extract and newVec and let the caller deal with the operations. Similarly, it is better to pass in the OpBuilder as an argument. If used within the DialectConversion pass, the ConversionPatternRewriter needs to track operations added/deleted so that it can undo them on failure.

mlir/test/lib/Transforms/TestVectorTransforms.cpp
138

I think if there were an Operation interface for all relevant ops, then this could be done as a pattern which would be highly preferable.

This revision now requires changes to proceed.Sep 28 2020, 3:00 PM

Forgot to add context.

ThomasRaoux added inline comments.Sep 28 2020, 3:11 PM
mlir/lib/Dialect/Vector/VectorTransforms.cpp
2453

Sure, I can change that. I'll wait for your second comment to be resolved first as this will work differently based on whether I use a pattern or do direct transformation.

mlir/test/lib/Transforms/TestVectorTransforms.cpp
138

The reason I'm not using a Pattern is not that there is no Operation interface but that the rewrite pattern would run into an infinite loop, since this transformation leaves the original instruction unchanged and just adds extract_map/insert_map ops that then get propagated by the canonicalization patterns.
(See discussion with Nicolas about that: https://reviews.llvm.org/D88341#inline-819945)
Here I went for solution 1. suggested by Nicolas. Using solution 2. would allow making it a pattern, however I'm not sure it is really cleaner. What do you think?

Change signature of distributPointwiseVectorOp based on comments.

Thanks for the feedback, I think I addressed everything. Please take another look.

mlir/lib/Dialect/Vector/VectorTransforms.cpp
2453

I changed the signature to return the extract/insert and do the replace outside the function.

mravishankar accepted this revision.Sep 29 2020, 1:24 PM

Looks fine to me. Please wait for Aart or Nicolas to approve.

mlir/include/mlir/Dialect/Vector/VectorTransforms.h
186

Last comment here (sorry for the multiple rounds); return by reference is weird. In Linalg this was done by creating a struct like

struct Foo {
  ExtractMapOp extract;
  InsertMapOp insert;
};

and the return type of the method being Optional<Foo>. I am not aware of the convention here, but I find this better. Return by reference typically makes sense for containers like vector, set, map, etc. to avoid a copy (though C++ semantics now is to move on return of such objects rather than copy).

Remove return by reference.

mlir/include/mlir/Dialect/Vector/VectorTransforms.h
186

Changed it.

nicolasvasilache accepted this revision.Sep 30 2020, 3:21 AM

LGTM, if a better name pops to mind please do propose.

mlir/include/mlir/Dialect/Vector/VectorOps.td
457

I agree the name is not good but I can't think of a better one.
We actually need 2 names not just one.
I originally went for insert/extract + map to mirror the existing insert/extract which behave similarly in LLVM.
There is an insertion/extraction of value happening: only 1 value is mapped.

So in the current form, this overlaps with insert/extract_element.

In the absence of a better proposal I'd stay with this name for now, but I'd very much like a better one.

mlir/include/mlir/Dialect/Vector/VectorTransforms.h
181

s/going from 0 to dimension/taking *all* values in [0 .. multiplicity - 1] (e.g. loop induction variable or SPMD id) ?
Not sure if this phrasing would be clear enough but I'd like to emphasize that this is a special type of value that must take all values in the range.

Please use multiplicity here and below for consistency.

mlir/test/lib/Transforms/TestVectorTransforms.cpp
138

The pattern is the one that propagates the insert/extract up and down and will benefit from the Interfaces.
It seems to me the addition of insert/extract pairs is the responsibility of the pass/transformation, which seems appropriate for the current state of the test pass.
I'd say we can revisit this later when we have enough patterns and we start looking at profitability.

This revision is now accepted and ready to land.Sep 30 2020, 3:21 AM

Re-phrase comment and replace dimension with multiplicity.

ThomasRaoux marked an inline comment as done.Sep 30 2020, 12:47 PM
ThomasRaoux added inline comments.
mlir/include/mlir/Dialect/Vector/VectorOps.td
457

I'll leave it like that for now then and we can think about a better name.

mlir/test/lib/Transforms/TestVectorTransforms.cpp
138

Leaving it like that for now.

I find these op semantics really hard to understand right now, I think the doc needs a significant amount of improvement.

I am very unconvinced by this, but this can be just a lack of understanding, can you please bring this up to Discourse and complete a design discussion on this topic?

mlir/include/mlir/Dialect/Vector/VectorOps.td
503

This IR and this transformation seems deeply broken to me at the moment.

->

func @distribute_vector_add(%A: vector<32xf32>, %B: vector<32xf32>) -> vector<32xf32> {

%C = simt.execute %A to %Aelt : vector<1xf32> affine_map<()[s0] -> (s0)> 
                                 %B to %Belt : vector<1xf32> affine_map<()[s0] -> (s0)> {
  %0 = addf %Aelt, %Belt : vector<1xf32>
  yield %0
}

This op's custom print is missing a 32 somewhere. Isn't %C of type vector<32xf32>? Not having that goes against IR printing guidelines.

bondhugula added inline comments.Oct 9 2020, 12:01 AM
mlir/include/mlir/Dialect/Vector/VectorOps.td
461

Why are these ops called extract_map and insert_map?! They aren't extracting or inserting maps nor are they extracting *and* mapping. The phrase "maps a given multiplicity of the vector ..." isn't really meaningful to me! Why is id used? Did you mean position or index? Both the naming and the doc description appear to be completely unclear and perhaps disconnected from the op's semantics.

I find these op semantics really hard to understand right now, I think the doc needs a significant amount of improvement.

I am very unconvinced by this, but this can be just a lack of understanding, can you please bring this up to Discourse and complete a design discussion on this topic?

Ping here? (without acknowledgement I can't know if this has been missed)

I find these op semantics really hard to understand right now, I think the doc needs a significant amount of improvement.

I am very unconvinced by this, but this can be just a lack of understanding, can you please bring this up to Discourse and complete a design discussion on this topic?

Ping here? (without acknowledgement I can't know if this has been missed)

Sorry for the delay. I'm starting a Discourse discussion and can update the doc based on it.

https://llvm.discourse.group/t/vector-vector-distribution-large-vector-to-small-vector/1983