This is an archive of the discontinued LLVM Phabricator instance.

[mlir][sparse][gpu] add sm8.0+ tensor core 2:4 sparsity support
ClosedPublic

Authored by K-Wu on May 30 2023, 9:02 PM.

Diff Detail

Event Timeline

K-Wu updated this revision to Diff 527471.Jun 1 2023, 9:50 AM

seemingly finished the cuda runtime wrapper part

K-Wu updated this revision to Diff 527496.Jun 1 2023, 10:39 AM

in progress

K-Wu updated this revision to Diff 527630.Jun 1 2023, 2:43 PM

in-progress

K-Wu updated this revision to Diff 527638.Jun 1 2023, 3:01 PM

fix compile error ?

K-Wu updated this revision to Diff 527658.Jun 1 2023, 4:20 PM

add tests and finished lower pass

K-Wu published this revision for review.Jun 1 2023, 4:21 PM
K-Wu added reviewers: PeimingLiu, wrengr, bixia, anlunx.
K-Wu updated this revision to Diff 527662.Jun 1 2023, 4:30 PM

correct type in the test

K-Wu updated this revision to Diff 527667.Jun 1 2023, 4:52 PM

stuck at declaring tuple literal in .mlir

mlir/test/Conversion/GPUCommon/2To4Sparsity/lower-2to4-sparse-to-gpu-runtime-calls.mlir
28

How can I create a three-element tuple literal here?

K-Wu updated this revision to Diff 527668.Jun 1 2023, 4:56 PM

correcting the type

K-Wu updated this revision to Diff 527716.Jun 1 2023, 8:28 PM

the lowering pass now works but I added the env handle to the CreateDnMatOp

K-Wu updated this revision to Diff 527720.Jun 1 2023, 9:01 PM

fixing compile error

K-Wu updated this revision to Diff 527728.Jun 1 2023, 9:51 PM

fixing test script

K-Wu updated this revision to Diff 527729.Jun 1 2023, 9:53 PM

refine names

K-Wu updated this revision to Diff 527736.Jun 1 2023, 10:28 PM

rebase origin/main

K-Wu updated this revision to Diff 527737.Jun 1 2023, 10:32 PM

fixing compile errors caused by rebase

K-Wu updated this revision to Diff 527747.Jun 1 2023, 11:03 PM

fixing merging errors and remove tuple types

K-Wu updated this revision to Diff 527751.Jun 1 2023, 11:13 PM

fixing compile errors; improve documentation

K-Wu updated this revision to Diff 527752.Jun 1 2023, 11:16 PM

fixing merging error

K-Wu updated this revision to Diff 527753.Jun 1 2023, 11:18 PM

fix example a bit

K-Wu updated this revision to Diff 527755.Jun 1 2023, 11:28 PM

fixing compile errors

K-Wu updated this revision to Diff 527756.Jun 1 2023, 11:34 PM

fixing compile errors

K-Wu retitled this revision from [mlir][sparse][gpu] add 2:4 sparsity support via cusparseLt to [mlir][sparse][gpu] add sm8.0+ tensor core 2:4 sparsity support.Jun 1 2023, 11:35 PM
K-Wu updated this revision to Diff 527758.Jun 1 2023, 11:41 PM

fix test error

K-Wu updated this revision to Diff 527896.Jun 2 2023, 10:13 AM

clean up unnecessary test flag addition

K-Wu updated this revision to Diff 527916.Jun 2 2023, 11:21 AM

using cmake find to switch on/off cuda official libraries

K-Wu added inline comments.Jun 2 2023, 11:22 AM
mlir/lib/ExecutionEngine/CMakeLists.txt
194–195

cusparse guarded without adding new build options

216

cusparse guarded without adding new build options

K-Wu updated this revision to Diff 527918.Jun 2 2023, 11:28 AM

fix cmakelists.txt

K-Wu added inline comments.Jun 2 2023, 11:35 AM
mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp
547

TODO: transpose mode A and B are specified here

549

TODO: pass in compute_type here
CUSPARSE_COMPUTE {_16F, _32I, _TF32, _TF32_FAST}

K-Wu updated this revision to Diff 527982.Jun 2 2023, 2:33 PM

rebase origin/main

K-Wu marked an inline comment as not done.Jun 2 2023, 2:33 PM
K-Wu updated this revision to Diff 528004.Jun 2 2023, 3:00 PM

implementing transpose and compute type

K-Wu updated this revision to Diff 528007.Jun 2 2023, 3:09 PM

done type enum

K-Wu updated this revision to Diff 528008.Jun 2 2023, 3:09 PM

format

K-Wu updated this revision to Diff 528010.Jun 2 2023, 3:12 PM

add todo

K-Wu added inline comments.Jun 2 2023, 3:19 PM
mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp
552

done

mlir/test/Conversion/GPUCommon/2To4Sparsity/lower-2to4-sparse-to-gpu-runtime-calls.mlir
28

This is now variadic; the newly added/changed ops no longer use a tuple as either operand or result.

K-Wu updated this revision to Diff 528043.Jun 2 2023, 5:10 PM
K-Wu added a comment.Jun 2 2023, 5:10 PM

fix format

aartbik added inline comments.Jun 5 2023, 4:55 PM
mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
1675

why did you add the handle to DnMat?

2006

this change seems unrelated to the core change of this revision?

mlir/lib/ExecutionEngine/CMakeLists.txt
193–194

Since you changed the build logic, I would move these two "find_lib" calls below to L217.
Also, no longer say "we need" as we did for L190, but now say:

"Determine if cuSPARSE lib and cuSPARSELT are available and, if so, set the MLIR_CUDA_SM80_SPARSE_ENABLED flag"

216

so right before this, the two find_lib without REQUIRED now

233

We now have situations like

(1) we cannot find either lib
(2) we find cusparse but not cusparselt
(3) we find both

Do we simply want to support ONLY (3) and guard all code with MLIR_CUDA_SM80_SPARSE_ENABLED,
or recognize (1) and (2) with a MLIR_CUDA_SPARSE_ENABLED?

mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp
24–29

since we now have builds possible without cusparse lib, don't we need a macro for this (and all cusparse wrappers)?

448

I would move this to L438, and use two empty lines around the #if since it applies to rest of file

593

newline?

K-Wu added inline comments.Jun 5 2023, 5:04 PM
mlir/lib/ExecutionEngine/CMakeLists.txt
233

Working on it. Adding a second flag in the wrapper cpp where CUSPARSE disabled will imply CUSPARSELT disabled

K-Wu updated this revision to Diff 528640.Jun 5 2023, 5:22 PM
K-Wu marked an inline comment as done.

fix one of the pointer-to-local-scope bugs and address some comments

K-Wu marked an inline comment as done.Jun 5 2023, 5:22 PM
K-Wu added inline comments.
mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp
448

addressed. Let me know if your thought is different from my new changes :D

K-Wu updated this revision to Diff 528648.Jun 5 2023, 5:33 PM
K-Wu marked an inline comment as done.

adding unsynced cmake change; addressing comments

K-Wu marked 2 inline comments as done.Jun 5 2023, 5:35 PM
K-Wu added inline comments.
mlir/lib/ExecutionEngine/CMakeLists.txt
193–194

Sorry I failed to sync changes from another machine which I thought had been synced by arcanist.

Please check the following few lines for the improved CMake build flag logic.

Please also check if the runtime flags in the CUDARuntimeWrappers.cpp address the comments.

K-Wu marked 2 inline comments as done.Jun 5 2023, 5:36 PM
K-Wu updated this revision to Diff 528649.Jun 5 2023, 5:37 PM

recover unsynced changes

K-Wu updated this revision to Diff 528666.Jun 5 2023, 8:06 PM

impl alloca for handles

K-Wu added inline comments.Jun 5 2023, 8:12 PM
mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
1675

cusparseLt's API is a bit different from cusparse and requires environment handle as input.

But I am open to ideas to split the revision. Also it seems feasible to retrieve the env implicitly as well. Let's discuss offline maybe.

2006

When using cusparseLt I get three sizes when invoking spmm_buffer_size. The three sizes are for workspace, compressed data and compressed buffer. Tablegen asks for a type in the assembly to understand whether the output is one size, or a tuple of three sizes. Let us discuss offline how to improve this.
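For illustration only (a sketch, not the code in this revision): one way to hand three sizes back without a tuple is to write them into a caller-provided array in a fixed order. The names SpMMBufferSizes, kNumSpMMBufferSizes, and storeSpMMBufferSizes are hypothetical, and no cusparseLt call is made here; the sizes are assumed to have been queried already.

```cpp
#include <cstddef>

// Hypothetical container for the three sizes mentioned above.
struct SpMMBufferSizes {
  std::size_t workspace;        // Scratch space for the matmul itself.
  std::size_t compressedData;   // Storage for the compressed 2:4 matrix.
  std::size_t compressedBuffer; // Temporary buffer used while compressing.
};

// Number of entries the caller must provide in the output array.
constexpr int kNumSpMMBufferSizes = 3;

// Stores the sizes in a fixed order (workspace, compressed data,
// compressed buffer) so the caller can read them back by index.
// `out` is assumed to point to at least kNumSpMMBufferSizes elements.
void storeSpMMBufferSizes(const SpMMBufferSizes &sizes, std::size_t *out) {
  out[0] = sizes.workspace;
  out[1] = sizes.compressedData;
  out[2] = sizes.compressedBuffer;
}
```

With this shape the op can keep a single buffer operand for the sizes, and the caller indexes into it for each allocation.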

K-Wu updated this revision to Diff 528669.Jun 5 2023, 8:32 PM

rebase origin/main

K-Wu updated this revision to Diff 528671.Jun 5 2023, 8:39 PM

fix unsynced test error. i hate arcanist

K-Wu updated this revision to Diff 528897.Jun 6 2023, 9:04 AM

format

K-Wu updated this revision to Diff 528899.Jun 6 2023, 9:07 AM

fix typo

aartbik added inline comments.Jun 6 2023, 9:36 AM
mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
1675

Ah for the 2:4 case.

shall we add it to dn_vec too then, for consistency?

mlir/lib/ExecutionEngine/CMakeLists.txt
194–195

I gave this some more thought, and I don't think we have any precedent in LLVM/MLIR building conditionally on the context. In fact, some users may not like this, since exactly the same config will result in different binaries on different installations, causing perhaps hard-to-track bugs on bots.

So let's keep the two flags, but keep the build logic

if(MLIR_ENABLE_CUDA_CUSPARSE)
  find_library(CUDA_CUSPARSE_LIBRARY cusparse HINTS ${CMAKE_CUDA_IMPLICIT_LINK_DIRECTORIES} REQUIRED)
  ...
  if(MLIR_ENABLE_CUDA_CUSPARSELT)
    find_library(CUDA_CUSPARSELT_LIBRARY cusparseLt HINTS ${CMAKE_CUDA_IMPLICIT_LINK_DIRECTORIES} REQUIRED)
    find_path(CUDA_CUSPARSELT_HEADER cusparseLt.h HINTS ${CMAKE_CUDA_TOOLKIT_INCLUDE_DIRECTORIES} REQUIRED)
    ...

Note
(1) the REQUIRED now that the build is requested by config
(2) CUSPARSELT is nested in CUSPARSE, i.e. you cannot have the LT without the non-LT version (seems reasonable)

K-Wu added inline comments.Jun 6 2023, 9:46 AM
mlir/lib/ExecutionEngine/CMakeLists.txt
194–195

That makes sense. Yeah cusparseLt uses some data structures in cusparse so (2) is very reasonable

K-Wu updated this revision to Diff 528914.Jun 6 2023, 10:01 AM

addressing comments

K-Wu marked 3 inline comments as done.Jun 6 2023, 10:01 AM
K-Wu added inline comments.
mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
1675

That makes sense. Done.

K-Wu marked an inline comment as done.Jun 6 2023, 10:10 AM
K-Wu updated this revision to Diff 528920.Jun 6 2023, 10:11 AM

fixing compile error

K-Wu updated this revision to Diff 528922.Jun 6 2023, 10:13 AM

fixing compile error

K-Wu updated this revision to Diff 528923.Jun 6 2023, 10:15 AM

fixing compile error

aartbik added inline comments.Jun 6 2023, 10:27 AM
mlir/lib/ExecutionEngine/CMakeLists.txt
194–195

This must now go under the CUDA_CUSPARSE_LIBRARY below

otherwise, this line will fail without lib even when macro is not set

197

Try to keep the comments under 80-col by breaking the line

typo: installed

Also, rephrase. Enabling the MLIR_ENABLE_CUDA_CUSPARSELT flag assumes the library is installed ...

198

you don't want to nest this in the one below?

K-Wu updated this revision to Diff 528942.Jun 6 2023, 10:58 AM

addressing comments

K-Wu marked 3 inline comments as done.Jun 6 2023, 10:59 AM
K-Wu updated this revision to Diff 528978.Jun 6 2023, 12:04 PM

addressing comments

K-Wu updated this revision to Diff 528994.Jun 6 2023, 12:37 PM

adding necessary bazel changes. credit to Aart

aartbik added inline comments.Jun 6 2023, 12:56 PM
mlir/lib/ExecutionEngine/CMakeLists.txt
216

This comment is outdated, since we no longer set the flag

# Find the libcusparse.so library if CUSPARSE build is requested.

224

Find the libcusparseLt.so library if CUSPARSELT build is requested.

<then rest of your comment, but broken up with 80-cols>

mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp
27

I would nest this inside the CUSPARSE #if/#endif,
since we only support both set, or the former set only

444

move this end to end of file, so we have nested definitions

mlir/test/Conversion/GPUCommon/2To4Sparsity/lit.local.cfg
2

not needed, we are not running this, just lowering

mlir/test/Conversion/GPUCommon/2To4Sparsity/lower-2to4-sparse-to-gpu-runtime-calls.mlir
2

don't introduce a new 2To4Sparsity directory, simply move this test file up to the same level as e.g. lower-sparse-to-gpu-runtime-calls.mlir

K-Wu marked 6 inline comments as done.Jun 6 2023, 1:06 PM
K-Wu updated this revision to Diff 529002.Jun 6 2023, 1:07 PM

fixing bazel errors and addressing comments

K-Wu updated this revision to Diff 529003.Jun 6 2023, 1:09 PM

rebase origin/main

K-Wu updated this revision to Diff 529005.Jun 6 2023, 1:13 PM

fix compile error; add moved new file

did you forget to add the lowering file to this revision after moving up?

mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp
818

Also add CreateCooAoS for completeness?
It is deprecated on the lib side, but it still lives as op here

822

Start comments with upper case and end with period

// Print the SpMat defining op.

Here and below a few times

827

a bit strange to return this as string?
Just a bool?

K-Wu updated this revision to Diff 529020.Jun 6 2023, 1:26 PM

fixing compile error

K-Wu updated this revision to Diff 529071.Jun 6 2023, 4:01 PM
K-Wu marked 3 inline comments as done.

addressing comments

K-Wu added a comment.Jun 6 2023, 4:01 PM

Addressed comments

aartbik accepted this revision.Jun 6 2023, 4:10 PM
This revision is now accepted and ready to land.Jun 6 2023, 4:10 PM
This revision was landed with ongoing or failed builds.Jun 6 2023, 4:13 PM
This revision was automatically updated to reflect the committed changes.