This is an archive of the discontinued LLVM Phabricator instance.

[mlir][sparse][gpu] add 2:4 spmm prune_and_check flag
ClosedPublic

Authored by K-Wu on Jul 20 2023, 7:03 PM.

Diff Detail

Event Timeline

K-Wu created this revision.Jul 20 2023, 7:03 PM
Herald added a reviewer: aartbik. · View Herald Transcript
Herald added a reviewer: dcaballe. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
K-Wu requested review of this revision.Jul 20 2023, 7:03 PM
Peiming accepted this revision.Jul 21 2023, 10:05 AM
This revision is now accepted and ready to land.Jul 21 2023, 10:05 AM
wrengr added inline comments.Jul 21 2023, 2:40 PM
mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp
751–752

Nit: Why not just combine these into one line?

mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp
668–671

Personally I think this (and the below if(prune_flag==2)) would read a lot cleaner if they were phrased in terms of Prune2To4SpMatFlag instead of int32_t. Fwiw, I know why you did it this way. I ran into the same issue about sharing enums between the compiler and the executionengine for the sparse-tensor runtime library, and fixing that takes a fair deal of boilerplate. Nevertheless it's unfortunate that tblgen'ed enums run into this problem.

I'm not suggesting you change it for this CL; rather, going forward you may want to consider factoring out an enums-library to be shared between the compiler and executionengine, since I expect the CUDA wrappers have a whole bunch of this sort of thing. If you want to do that, drop me a line; there are a number of gotchas on the cmake side of things, which I worked out for the sparse-tensor RT and can factor out into a reusable cmake-function

aartbik added inline comments.Jul 31 2023, 3:58 PM
mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
1717

This is a bit nitpicky, but most enums have very short descriptions in this string field, and typically no verb form
So

"pruning strategy for 2:4 sparse matrix"

would look a bit better here

1723

these let's should be indented two less (move to left)

1749

nitpick, having the enum flag as a regular operand (viz. comma argument) feels a bit out of place, we typically use different syntax for that
can we e.g. put it before the [] to make it stand out?

%spmat, %token = gpu.create_2to4_spmat async PRUNE_AND_CHECK [%dep] %rows, %cols, %mem : memref<?xf64>

I am open for other suggestions, but the comma argument seems not quite right

mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp
290

we actually don't document any other flag (except for stream that always is last)
can we make this

{llvmPointerType, llvmInt32Type, llvmInt32Type, llvmPointerType,

llvmPointerType, llvmPointerType, llvmInt32Type, llvmInt32Type,
llvmPointerType /*void *stream*/}};

instead (and yes, perhaps we should document all parameters at one point, but now it feels inconsistent)

751–752

+1

1647

see above on parameter order

mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp
633

To avoid starting the sentence with lower case, just say

The parameter prune_flag is used....

Also, see above on order

668–671

+1 on both the sentiment, but also to leave this for a later CL

aartbik accepted this revision.Jul 31 2023, 4:02 PM
wrengr added inline comments.Jul 31 2023, 4:07 PM
mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
1749

I agree: since the $pruneFlag is an attribute rather than a value, it should either be (a) part of the attr-dict, or (b) use some other syntax that helps it stand out from the value arguments. I'm less sure about putting it before the $asyncDependencies, just because elsewhere we tend to reserve that spot for the $asyncToken; though if we do put the $pruneFlag there, I think it should come before the $asyncToken so that the $asyncToken and $asyncDependencies can remain adjacent

K-Wu updated this revision to Diff 545869.Jul 31 2023, 5:08 PM
K-Wu marked 9 inline comments as done.

addressing comments

K-Wu added a comment.Jul 31 2023, 5:08 PM

addressing comments

K-Wu updated this revision to Diff 545885.Jul 31 2023, 5:54 PM

fix test casess

This revision was automatically updated to reflect the committed changes.