Diff Detail
- Repository: rG LLVM Github Monorepo

Event Timeline
mlir/test/Integration/Dialect/SparseTensor/GPU/CUDA/sparse-matmul-2-4-lib.mlir

- Line 17: This seems a bit copied-and-pasted from the sparse-mma-2-4-f16.mlir test (which really uses device code for this method, e.g. via nvgpu.mma.sp.sync). Here, however, the library calls are still made from the host, so I would remove the device/host comments here at L17 and at L62. Also, the gpu.container_module attribute is not needed, since no method is defined inside a gpu.module.
- Line 34: Commented-out code?
- Line 207: Avoid commented-out code.
mlir/test/Integration/Dialect/SparseTensor/GPU/CUDA/sparse-matmul-2-4-lib.mlir

- Line 17: Thanks for all these comments! They are all addressed now.
mlir/test/Integration/Dialect/SparseTensor/GPU/CUDA/sparse-matmul-2-4-lib.mlir

- Line 5: It looks like this pipeline can be simplified quite a bit; all the gpu.module(....) nestings can go, right?
- Line 15: Remove gpu.container_module.
- Line 25: Add a comment explaining the magic constant here.
- Line 42: Does it work without it? In any case, make the TODO jump out a bit more.
- Line 119: Copy-and-paste comment; this is no longer the compressed matrix, but the full 2:4 matrix A.
- Line 145: Add an empty // line after this comment to separate it from the CHECK lines.
- Line 197: There are no warps in this code, so simply say "Call the kernel".
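The comment about line 119 distinguishes the full 2:4 matrix A from its compressed representation. As a quick illustration of what that distinction means (a hypothetical Python helper, not part of the patch under review): in 2:4 structured sparsity, every contiguous group of four elements along a row holds at most two nonzeros; the "full" matrix stores all entries including the zeros, while the compressed form keeps only the two values per group plus their positions (the metadata consumed by sparse tensor cores).

```python
def compress_2_4(row):
    """Compress one row of a 2:4-sparse matrix into (values, indices).

    Hypothetical sketch: each group of 4 entries contributes exactly
    2 value slots and 2 index slots, padding with explicit zeros when
    a group has fewer than 2 nonzeros.
    """
    values, indices = [], []
    for g in range(0, len(row), 4):
        group = row[g:g + 4]
        nz = [(i, v) for i, v in enumerate(group) if v != 0]
        assert len(nz) <= 2, "row violates the 2:4 sparsity pattern"
        # Pad so every group yields exactly two (index, value) slots.
        while len(nz) < 2:
            nz.append((len(nz), 0))
        for i, v in nz:
            indices.append(i)
            values.append(v)
    return values, indices

# A "full" 2:4 row: two nonzeros in each group of four.
full_row = [1, 0, 0, 2, 0, 3, 4, 0]
vals, idx = compress_2_4(full_row)
print(vals)  # [1, 2, 3, 4]
print(idx)   # [0, 3, 1, 2]
```

This is why the comment matters: the test prints and checks the full matrix A, not this compressed values/metadata pair.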