- User Since
- Mar 14 2022, 9:59 PM (54 w, 2 d)
Fri, Mar 17
Thanks so much! I'm not sure how I didn't catch this sooner. Change LGTM. I'll send an updated integration test for the other diff.
Jan 17 2023
I think GPU function support for EmitC may be better off as a GPUToEmitC conversion pass where gpu.func gets converted to func.func. This would subsume the pass you added as well as the changes to the emitter.
Jan 13 2023
Fix commit message.
Jan 10 2023
Fix trailing whitespace.
Add some tests for the standalone "gpu-lower-memory-space-attributes" pass.
Jan 6 2023
Jan 5 2023
F32 needs to be extended to f64. If you add a test you will see it fails.
Address all comments
Dec 23 2022
The code for the pass that is introduced here is being exercised in the gpu-to-llvm conversions, but I still need to add tests for the pass in isolation.
Nov 30 2022
Can you add some more detail to the commit message? Per our offline discussion, the issue became apparent when the .x4 variant is used with transpose = true. The changes in the tests are mostly NFC with the exception of m16n16k16_mmasync16816_fp16_f16_row_row_row, where the rowB/colB affine maps were incorrect. For the other .x2 test cases, the new affine maps introduced here and the old affine maps in the CHECK statements are equivalent because only the first 16 thread ids in the warp matter.
Nov 7 2022
Nov 1 2022
Fix missing change of nvgpu metadata type to vector<2xi16>
Oct 22 2022
Oct 21 2022
Looks good to me.
Oct 19 2022
Oct 16 2022
Address comment: use the strategy of applying a rank-reducing extract slice op to the source of the collapse shape.
Oct 14 2022
I implemented the suggestion today and it works well. I will put up the cleaned-up version early next week.
Oct 13 2022
Oct 12 2022
Looks great, made a few minor comments. Thanks!
Oct 11 2022
Oct 5 2022
Sep 23 2022
Fix missing newline
Sep 22 2022
Sep 21 2022
Closing in favor of introducing an interface. Here is the draft diff: https://reviews.llvm.org/D134393
Seems like the utility function mergeOffsetsSizesAndStrides should be added to ViewLikeInterface or Dialect/Utils because it has greater scope. The primary use that comes to mind is the memref "fold alias ops" (used to be "fold subviews"), which essentially does the same thing (SubView and ExtractSlice both implement the OffsetSizeAndStrideOpInterface). Therefore, it seems like an opportunity for code de-duplication.
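To make the shared logic concrete, here is a rough sketch of what composing two nested strided slices amounts to. It is not the MLIR implementation of mergeOffsetsSizesAndStrides: the Python names `merge_offsets_sizes_strides` and `take_1d` are hypothetical, and the sketch assumes static values, non-negative offsets, positive strides, and no rank reduction.

```python
from typing import List, Sequence, Tuple

# One (offset, size, stride) triple per dimension.
Slice = Tuple[int, int, int]


def merge_offsets_sizes_strides(outer: Sequence[Slice],
                                inner: Sequence[Slice]) -> List[Slice]:
    """Compose two nested strided slices into one slice of the original source.

    Hypothetical helper for illustration only; it handles the static,
    non-rank-reducing case and nothing else.
    """
    if len(outer) != len(inner):
        raise ValueError("rank-reducing slices are not handled in this sketch")
    merged = []
    for (o_off, o_size, o_stride), (i_off, i_size, i_stride) in zip(outer, inner):
        # The last element read by the inner slice must exist in the outer one.
        if i_size > 0 and i_off + (i_size - 1) * i_stride >= o_size:
            raise ValueError("inner slice reads outside the outer slice")
        merged.append((
            o_off + i_off * o_stride,  # inner offset is scaled by the outer stride
            i_size,                    # the result has the inner slice's size
            o_stride * i_stride,       # strides multiply
        ))
    return merged


def take_1d(values: Sequence[int], s: Slice) -> List[int]:
    """Apply a single (offset, size, stride) slice to a 1-D sequence."""
    offset, size, stride = s
    return [values[offset + k * stride] for k in range(size)]


# Slicing twice and slicing once with the merged triple pick the same elements.
source = list(range(32))
outer, inner = (2, 10, 3), (1, 4, 2)
two_step = take_1d(take_1d(source, outer), inner)
(one_step,) = merge_offsets_sizes_strides([outer], [inner])
assert take_1d(source, one_step) == two_step
```

The same per-dimension arithmetic applies whether the producer is a SubView or an ExtractSlice, which is why a single shared utility seems plausible.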
Sep 20 2022
Remove unused changes to RegionUtils.h
Address remaining comments.
Sep 19 2022
Address more comments.
Address reviewer comments
Fix missing doc
Clean up and fix some mistakes and inefficiencies.
More general solution
Sep 16 2022
Handle uses of nested values defined above in same stage / different stage.
May need some additional cleanup.
Sep 15 2022
Fix formatting issues.
Sep 8 2022
Rebase on D133523, resolve linalg dependency issue.
This is needed to address circular dependency blocking https://reviews.llvm.org/D129699.
Depends on https://reviews.llvm.org/D133523 to resolve circular dependency with ViewLikeInterface
Sep 2 2022
The Windows bot seems to be failing for an unrelated reason.
Thanks for adding this.
Sep 1 2022
Fix typo in doc
Aug 31 2022
Expand the documentation for the main class that is added. I checked the result of the Doxygen build to make sure it looks correct with relevant hyperlinks and headings correctly generated.