Page MenuHomePhabricator

[mlir] Support masks in TransferOpReduceRank and TransferReadPermutationLowering

Authored by springerm on May 11 2021, 1:32 AM.



These two patterns allow for more efficient codegen in VectorToSCF.

Depends On D101745

Diff Detail

Event Timeline

springerm created this revision.May 11 2021, 1:32 AM
springerm requested review of this revision.May 11 2021, 1:32 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 11 2021, 1:32 AM
springerm added inline comments.May 11 2021, 1:34 AM

I don't fully understand what this is doing. If there's a better way to implement the mask handling below (maybe reusing some of these functions), please let me know.

springerm updated this revision to Diff 344318.May 11 2021, 1:38 AM

changed comment

springerm updated this revision to Diff 344322.May 11 2021, 1:40 AM

minor change

ThomasRaoux requested changes to this revision.May 11 2021, 7:58 AM
ThomasRaoux added inline comments.

You should be able to directly permutation for the transpose. (you'll have to convert it to int64_t unfortunately as affineMap take unsigned but transpose take int64_t, this is something we should fix)

What this isPermutationOfMinorIdentityWithBroadcasting does is it picks a permutation to get to minor identity with broadcast. If there is a broadcast in the map there may be several potential permutation but the right thing to do is to apply the same transpose to masks and dimensions.

In the code we inverse the permutation map to apply it to the result of the transfer read as we want to convert it back to the original order but for masks you just want to directly apply the permutation.


Don't we need to remove the leading dimension of the mask so that it matches the new rank of the transfer read? We should also add a test for this case.

This revision now requires changes to proceed.May 11 2021, 7:58 AM
springerm added inline comments.May 11 2021, 5:20 PM

Thanks for explaining.

I cannot use permutation directly, because mask ignores all broadcast dimensions. E.g., a 4D transfer read op would have a 1D mask if 3 dimensions are broadcasted. Therefore, I would somehow have to remove source dimensions from permutation and "compress" (re-index) the map.

Note, I added support for transfer op masks recently, and the commit handling broadcasts (D101745) is not submitted yet. So if the "vector type shape != mask type shape" thing is surprising, this is something we could still discuss in that commit.


Since broadcast dims do not have a corresponding dimension in the mask vector, removing a broadcast does not require any changes to the mask vector.

ThomasRaoux added inline comments.May 11 2021, 8:59 PM

I see, I missed the part that mask skips the broadcast dimension. In this case it makes sense, I added some potential simplification below.


Can you skip all this by doing compressUnusedDims(map)? this will remove all the unused dimension then you can just loop through all the results and do

for (unsigned i = 0; i < map.getNumResults(); ++i) {
  if (auto expr = map.getResult(i).dyn_cast<AffineDimExpr>())

should be cast<> instead of dyn_cast since you already know the type.


Makes sense.

nicolasvasilache accepted this revision.May 12 2021, 3:39 AM

Same as previous PRs, please add some minimal IR test.
Approving conditioned on that.



address comments

springerm marked 5 inline comments as done.May 12 2021, 11:06 PM

Test case is included in mlir/test/Dialect/Vector/vector-transfer-lowering.mlir.


Nice, I didn't know that there's actually a compressUnusedDims helper function.

This revision was not accepted when it landed; it landed in state Needs Review.May 12 2021, 11:16 PM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.