This is an archive of the discontinued LLVM Phabricator instance.

[mlir][memref] Implement fast lowering of memref.copy
ClosedPublic

Authored by herhut on Dec 21 2021, 4:00 AM.

Details

Summary

In the absence of maps, we can lower memref.copy to a memcpy.

Diff Detail

Event Timeline

herhut created this revision.Dec 21 2021, 4:00 AM
herhut requested review of this revision.Dec 21 2021, 4:00 AM

Do you want to instead create one pattern with two functions in it? It'll lead to less overhead in the greedy rewrite driver and perhaps also easier to later choose between the two.

mlir/lib/Conversion/MemRefToLLVM/MemRefToLLVM.cpp
825

Doc comment here - along the lines of your revision summary and title.

mehdi_amini added inline comments.Dec 21 2021, 11:49 AM
mlir/lib/Conversion/MemRefToLLVM/MemRefToLLVM.cpp
763

This should likely be documented, I think notifyFailure would do it here.

herhut updated this revision to Diff 395851.Dec 22 2021, 5:21 AM

Add comments and failure reason.

herhut marked 2 inline comments as done.Dec 22 2021, 5:24 AM

Do you want to instead create one pattern with two functions in it? It'll lead to less overhead in the greedy rewrite driver and perhaps also easier to later choose between the two.

I though about that, too. Another way would be to give the memcpy based pattern higher benefit and actually make then non-exclusive. That way one could optionally blend in the memcpy based pattern.

One issue I currently see with this entire thing is that it relies on maps being present as a way to understand whether the memref has identity strides and offset. I am not sure whether this assumption actually holds for all users.

Do you want to instead create one pattern with two functions in it? It'll lead to less overhead in the greedy rewrite driver and perhaps also easier to later choose between the two.

I though about that, too. Another way would be to give the memcpy based pattern higher benefit and actually make then non-exclusive. That way one could optionally blend in the memcpy based pattern.

All of this still leads to two patterns.

One issue I currently see with this entire thing is that it relies on maps being present as a way to understand whether the memref has identity strides and offset. I am not sure whether this assumption actually holds for all users.

You are just using isIdentity() -- the API doesn't name map outside. In the inside, it's still a map.

mlir/lib/Conversion/MemRefToLLVM/MemRefToLLVM.cpp
716–717

non-identity maps -> non-identity layout

There is no reference to map in the API and in your code below and there isn't a need to. The MemRefLayoutAttrInterface hides it.

Do you want to instead create one pattern with two functions in it? It'll lead to less overhead in the greedy rewrite driver and perhaps also easier to later choose between the two.

I though about that, too. Another way would be to give the memcpy based pattern higher benefit and actually make then non-exclusive. That way one could optionally blend in the memcpy based pattern.

All of this still leads to two patterns.

Yes, my question is more whether that is useful, to have it as two patterns vs. one. The benefit modelling would be one reason. If everybody thinks that this is not useful enough for the cost, I can merge them.

One issue I currently see with this entire thing is that it relies on maps being present as a way to understand whether the memref has identity strides and offset. I am not sure whether this assumption actually holds for all users.

You are just using isIdentity() -- the API doesn't name map outside. In the inside, it's still a map.

I was not clear. I was wondering about the case where the static type of the memref does not have a map but the descriptor at runtime still does not have identity strides. Currently, one can create this setting using memref.reinterpret_cast by omitting the map from the target type. Maybe that should be illegal.

You are just using isIdentity() -- the API doesn't name map outside. In the inside, it's still a map.

I was not clear. I was wondering about the case where the static type of the memref does not have a map but the descriptor at runtime still does not have identity strides. Currently, one can create this setting using memref.reinterpret_cast by omitting the map from the target type. Maybe that should be illegal.

This would be an invalid op and should have failed the verifier in the first place. (Note that the memref always has a map -- it's not printed if it's identity as you know. Also, identity strides (all ones) don't correspond to an identity map -- N^2, N, 1 for example would correspond to an identity map for a 3-d memref for example.)

herhut updated this revision to Diff 398072.Jan 7 2022, 1:01 AM

Combined the two patterns.

herhut added a comment.Jan 7 2022, 1:04 AM

You are just using isIdentity() -- the API doesn't name map outside. In the inside, it's still a map.

I was not clear. I was wondering about the case where the static type of the memref does not have a map but the descriptor at runtime still does not have identity strides. Currently, one can create this setting using memref.reinterpret_cast by omitting the map from the target type. Maybe that should be illegal.

This would be an invalid op and should have failed the verifier in the first place. (Note that the memref always has a map -- it's not printed if it's identity as you know. Also, identity strides (all ones) don't correspond to an identity map -- N^2, N, 1 for example would correspond to an identity map for a 3-d memref for example.)

I have sent https://reviews.llvm.org/D116601 to improve verification. One can still create this case dynamically but as we consider that illegal, the result of memref.copy is undefined in such cases, so it is fine to just use the intrinsics.

bondhugula accepted this revision.Jan 7 2022, 7:17 AM

LGTM - thanks!

This revision is now accepted and ready to land.Jan 7 2022, 7:17 AM
herhut updated this revision to Diff 399613.Jan 13 2022, 3:06 AM

fix test

This revision was automatically updated to reflect the committed changes.