(1) keep all cuSparse ops on a single stream, in the right order, without wait()
(2) use more type-precise memref types for COO
(3) use ToTensor on the resulting memref (even though it folds away again)
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
mlir/lib/Dialect/SparseTensor/Transforms/SparseGPUCodegen.cpp | ||
---|---|---|
392 | Can we do a colA assertion at the beginning of the function instead of doing it twice? |
mlir/lib/Dialect/SparseTensor/Transforms/SparseGPUCodegen.cpp | ||
---|---|---|
392 | Ah, in this case we could, but if we implement the "unreachable" part (which I have in my local workspace with a COO_AOS) then you have a path without col coming in. So it looks a bit redundant now, but think of "unreachable" testing for !col at least ;-) |
mlir/lib/Dialect/SparseTensor/Transforms/SparseGPUCodegen.cpp | ||
---|---|---|
392 | Okay, that makes sense. Thank you. |
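
A minimal sketch of the point made in this thread, not the actual code in SparseGPUCodegen.cpp: if a future path (the COO_AOS case mentioned above) arrives without a separate column buffer, a single colA assertion at the top of the function would reject that path, so each branch keeps its own check. Everything here except the name colA is hypothetical: the `Format` enum, the `Buffer` alias, and `describeCreate` exist only for illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical storage formats; COO_AoS stands in for the path that is
// still "unreachable" in the reviewed revision.
enum class Format { CSR, COO_SoA, COO_AoS };

// Hypothetical stand-in for a coordinate buffer; nullptr means "not provided".
using Buffer = const int *;

// Returns a tag naming the matrix kind that would be created.
// The colA check lives inside each branch because its expected value
// differs per format, so it cannot be hoisted to the top of the function.
std::string describeCreate(Format fmt, Buffer rowA, Buffer colA) {
  assert(rowA && "every format needs at least one coordinate buffer");
  switch (fmt) {
  case Format::CSR:
    assert(colA && "CSR needs a separate column buffer");
    return "csr";
  case Format::COO_SoA:
    assert(colA && "COO (SoA) needs a separate column buffer");
    return "coo";
  case Format::COO_AoS:
    // Row and column coordinates are interleaved in rowA, so no colA comes in.
    assert(!colA && "COO (AoS) must not receive a separate column buffer");
    return "coo_aos";
  }
  throw std::logic_error("unknown format");
}
```

With only the first two branches implemented, the two colA assertions look redundant; the third branch is what makes a hoisted assertion incorrect.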