This is an archive of the discontinued LLVM Phabricator instance.

Adding a named op for grouped convolutions
Accepted · Public

Authored by gpetters94 on Jun 7 2022, 9:12 AM.

Details

Summary

This is very much a WIP, but I'd like to get feedback on whether this is the correct way to go about implementing this. As per Sean Silva's advice, this uses the group size as a second batch dimension in both the input and the weight.
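
For reference, a minimal opdsl sketch of the approach described above (the op name, tensor layouts, and exact builder calls are illustrative assumptions, not quoted from the patch):

    @linalg_structured_op
    def conv_2d_ngchw_fgchw(
        I=TensorDef(T1, S.N, S.G, S.C, S.OH * S.SH + S.KH * S.DH,
                    S.OW * S.SW + S.KW * S.DW),
        K=TensorDef(T2, S.FG, S.G, S.C, S.KH, S.KW),
        O=TensorDef(U, S.N, S.G, S.FG, S.OH, S.OW, output=True),
        strides=IndexAttrDef(S.SH, S.SW, default=[1, 1]),
        dilations=IndexAttrDef(S.DH, S.DW, default=[1, 1])):
      """2-D grouped convolution; the group dimension G is explicit in the
      input (NGCHW), kernel (FGCHW), and output (NGFHW)."""
      implements(ConvolutionOpInterface)
      domain(D.n, D.g, D.fg, D.oh, D.ow, D.c, D.kh, D.kw)
      O[D.n, D.g, D.fg, D.oh, D.ow] += TypeFn.cast_signed(
          U, I[D.n, D.g, D.c, D.oh * S.SH + D.kh * S.DH,
               D.ow * S.SW + D.kw * S.DW]) * TypeFn.cast_signed(
                   U, K[D.fg, D.g, D.c, D.kh, D.kw])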

Diff Detail

Event Timeline

gpetters94 created this revision. · Jun 7 2022, 9:12 AM
Herald added a project: Restricted Project. · Jun 7 2022, 9:12 AM
gpetters94 requested review of this revision. · Jun 7 2022, 9:12 AM

I suspect the g dimension should be contiguous with the C dimension, and I'm not sure if it should be major or minor to it.

I think it makes more sense for it to be major, if only for legibility's sake.

mravishankar requested changes to this revision. · Jun 7 2022, 9:20 PM
mravishankar added inline comments.
mlir/python/mlir/dialects/linalg/opdsl/ops/core_named_ops.py:364

Mods and divs in the indexing maps are going to be problematic. I'd expect the input shape to be [NG][N][CG][...][...], where NG is assumed to be C/G and CG is assumed to be C%G. We cannot enforce those conditions; they are expected to be true.
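
The trade-off behind this comment can be demonstrated with plain numpy (illustrative only, not opdsl): pre-splitting the channel dimension into (G, C/G) keeps the per-element indexing free of div/mod, at the cost of an invariant the op itself cannot verify.

    import numpy as np

    N, G, CG = 2, 3, 4                    # CG plays the role of C/G
    C = G * CG                            # the unverifiable precondition
    x = np.arange(N * C).reshape(N, C)    # single fused channel dimension
    y = x.reshape(N, G, CG)               # channels pre-split into (g, cg)

    for c in range(C):
        # Indexing the fused form needs div/mod; the split form does not.
        assert (x[:, c] == y[:, c // CG, c % CG]).all()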

This revision now requires changes to proceed. · Jun 7 2022, 9:20 PM

I think it makes more sense for it to be major, if only for legibility's sake.

It's not really something we get to choose -- the op definition in frontends determines whether the groups are major or minor to the "real" channels -- I think it is major, but please verify that. If we don't get this right, then we have to transpose before calling this op, which is wasteful.

It looks like PyTorch's slow CPU backend implements this as a batch of convolutions rather than with a batch dimension. After some digging through mkldnn, it looks like the group is major there, although that's opaque to the user. So for PyTorch at least, [N, G, C, ...] looks like the way to go.
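
A quick way to check the group-major claim from the user-visible side (a sketch with arbitrary shapes; only the public torch API is assumed): a grouped conv2d should match slicing the channels into G contiguous blocks, convolving each block, and concatenating, which is exactly the [N, G, C, ...] reading.

    import torch
    import torch.nn.functional as F

    N, G, Cg, Fg, H, W = 2, 3, 4, 5, 8, 8
    x = torch.randn(N, G * Cg, H, W)
    w = torch.randn(G * Fg, Cg, 3, 3)

    y = F.conv2d(x, w, groups=G)

    # Group-major: group g owns input channels [g*Cg, (g+1)*Cg) and
    # weight filters [g*Fg, (g+1)*Fg).
    y_ref = torch.cat(
        [F.conv2d(xg, wg) for xg, wg in zip(x.chunk(G, dim=1),
                                            w.chunk(G, dim=0))],
        dim=1)
    assert torch.allclose(y, y_ref, atol=1e-5)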

Addressed comments.

gpetters94 marked an inline comment as done. · Jun 9 2022, 11:38 PM

Updated indexing maps.

mravishankar accepted this revision. · Jun 13 2022, 8:47 PM
This revision is now accepted and ready to land. · Jun 13 2022, 8:47 PM

Finally finished the bugfixing, with a lot of help from Mahesh. Should be ready to be merged unless anyone has any more requests.