Thanks for addressing the large number of comments. Some additional minor ones and one that was missed (or not pushed). This overall looks great to me!
Feb 22 2021
Hi, thanks for the comments.
A high-level design question: why does the element type of mmafragment have to be a vector type? I'd just use 2D indexing for the fragment, it's not like we are going to extract vectors from it.
I have tried to keep the types as close as possible to what the corresponding LLVM intrinsics expect. Since the mma.compute intrinsic takes operands in <2 x half> form and returns results in the same form, I used the vector type.
This is an anti-argument for me; I see very little value in merely lifting low-level LLVM abstractions to higher levels. Hardcoding NVVM-specific modeling in the GPU dialect, which is supposed to abstract that away, defies its purpose. It sounds like mmafragment<AxBxf16> would make all of the code, except for a tiny part of the conversion, simpler.
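To illustrate the two options being discussed (a sketch only; `!gpu.mmafragment` and the shapes are placeholder names, not the actual syntax under review):

```mlir
// Option 1 (current proposal): the element type mirrors the NVVM
// intrinsic's <2 x half> operand packing.
// %a : !gpu.mmafragment<8x1xvector<2xf16>>

// Option 2 (suggested): plain 2-D f16 indexing; the <2 x half>
// packing becomes a detail of the GPU-to-NVVM conversion.
// %b : !gpu.mmafragment<16x16xf16>
```

With option 2, only the conversion pattern needs to know how NVVM packs elements, keeping the GPU dialect target-neutral.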
If you have a fix for the iter_args handling, please also update the commit summary to reflect that.
We should prevent loops with iter_args from being fused.
Hi Diego, do you prefer to have it in this patch? I could do it by simply checking the source and destination affine.for: if either of them uses iter_args (or returns an SSA value), we prevent them from being fused.
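For instance, a loop like the following carries an iter_args value and yields a result, so the proposed check would make fusion bail out on it (illustrative IR using the affine dialect syntax of the time):

```mlir
// A reduction loop: %acc is threaded through iterations via iter_args,
// and the loop produces an SSA result (%sum), so it should not be fused.
%sum = affine.for %i = 0 to 10 iter_args(%acc = %init) -> (f32) {
  %v = affine.load %A[%i] : memref<10xf32>
  %new = addf %acc, %v : f32
  affine.yield %new : f32
}
```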
Feb 20 2021
Some comments on clarifying doc / code comments.
Thanks for the patch! There is a different but somewhat related issue being addressed here: https://reviews.llvm.org/D97032.
I would like to know what @bondhugula and @andydavis1 think, but I believe we might want to add the proper %0 -> %i1 dependence to the MDG instead of special-casing the load scenario. Even though it's not strictly a memref dependence, it's a memref-related dependence between two graph nodes.
Nit: ... also be made to be MemRefsNormalizable -> should have the MemRefsNormalizable trait.
Feb 13 2021
LGTM, thanks. Please do resolve the remaining comments.
Feb 11 2021
Looks fine modulo the minor comments I had.
Feb 10 2021
Please also make the commit title more descriptive.
Thanks for making the "memref dereferencing ops" handling more systematic here. This is looking good to me. Mostly polishing and doc-related comments.
Mostly minor comments to address.
Feb 9 2021
Address review comment on auto.
Rebase on main tip. Minor doc comment and commit summary update.
Feb 8 2021
A few more minor comments. Please also address the commit summary comment above. This functionality is looking great to me. Please do go through Alex's deeper review comments.
This op will have to be moved to the right dialect once the std dialect split completes - mostly scf.
Feb 7 2021
Please remove RFC from the commit title. Also,
"MemRef memory space as Attribute" -> "Model MemRef memory space as Attribute" or "Switch MemRef memory space: unsigned -> Attribute".
Feb 6 2021
Can you please add a commit summary as well - "manipulate ranked strided memref" doesn't capture all of this support accurately.
Feb 5 2021
Thanks for improving this. Let me know if you want me to commit this for you.
Feb 4 2021
"inherent" (or "core") and "external" are looking good to me. Thanks!
Can you use mma_fragment instead of mmafragment for better readability?
Feb 2 2021
Jan 30 2021
Thanks for noticing this issue. A couple of comments.
Jan 26 2021
This seems to me like a 'tip of the iceberg' kind of solution. What happens if the attribute needs to be transformed? What happens if there is not a 1:1 correspondence between input operations and output operations in a transformation? And so on. I'm not sure what the right solution is, but this seems to be a narrowly scoped infrastructure solution to a TensorFlow problem. I'd be much happier if this solved other obvious problems as well.
Jan 25 2021
Jan 23 2021
Jan 20 2021
Jan 19 2021
This looks really great! A bunch of minor comments to address.
Jan 6 2021
Can you please add a proper commit summary with the rationale (basically whatever is in the discussion now)? Also, as @jpienaar points out, the commit title isn't really accurate.
Jan 1 2021
Missing a test case?
Dec 31 2020
Thanks very much for fixing this!
Dec 24 2020
Dec 23 2020
@ThomasRaoux, on a not directly related note: this file, VectorOps.cpp, is taking nearly 12-15s to build on a fast workstation, directly increasing build times and the critical path towards mlir-opt and other targets for everyone. It'd be great to consider refactoring it among the next steps.
Dec 22 2020
Dec 21 2020
Dec 19 2020
Mostly minor comments.
Dec 11 2020
Dec 10 2020
Thanks for this significant update! Some superficial comments to start with. It may be good to handle the pass documentation update as well, starting with an update to the summary, which is currently empty: https://mlir.llvm.org/docs/Passes/#-affine-loop-fusion-fuse-affine-loop-nests
You could mention the fusion strategies, producer/consumer and sibling, as well. Here's the previous doc paragraph that disappeared (was overwritten) when the pass documentation was migrated to auto-generation.