This revision extends the GPU dialect with ops that can be lowered to
host-oriented sparse matrix library calls (in this case cuSparse focused
although the ops could be generalized to support more GPUs in principle).
This will allow the "sparse compiler pipeline" to accelerate sparse operations
(see follow up revisions with examples of this).
For some background;
https://discourse.llvm.org/t/sparse-compiler-and-gpu-code-generation/69786/2
nit: remove empty line?