This is an archive of the discontinued LLVM Phabricator instance.

[mlir] [sparse] [gpu] adding transpose support to spmm spmv
ClosedPublic

Authored by K-Wu on May 23 2023, 2:27 PM.

Diff Detail

Event Timeline

K-Wu created this revision.May 23 2023, 2:27 PM
Herald added a reviewer: aartbik. · View Herald Transcript
Herald added a reviewer: dcaballe. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
K-Wu requested review of this revision.May 23 2023, 2:27 PM
wrengr added inline comments.May 23 2023, 2:49 PM
mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
1808–1810

I think it'd be better to just inline these into the definition of GPU_MatTransposeMode (see SparseTensorSortKindEnum for an example using that style)

1818

It would be better to use GPU_Dialect.cppNamespace instead of this string. It's best to avoid using literals whenever possible, both for future-proofing (not that they're likely to change this one any time soon) and to help guide the reader (i.e., to make explicit when things are definitionally equal, rather than just happening to be contingently equal).

1848

I think it'd be better to name this modeA (or more ideally transposeModeA if there's any chance of there being other "modes" in the future). The term "op" is generally used for operations, which this isn't.

And ditto for the other ops.

1931

Since the GPU_MatTransposeModeAttr arguments seem to be attached to particular GPU_SparseHandle arguments, I think it'd be better to use a syntax that helps make that more clear. For example, you could use something like gpu.spmm_buffersize async [%dep] %env, %spmatA{%opA}, %dnmatB{%opB}, %dnmatC (where I chose to use curly braces since that's what's typically used for attributes).

And ditto for the other ops.

K-Wu added inline comments.May 23 2023, 2:54 PM
mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
1848

I felt hard to choose when I wrote this code. The reason of using op is that cusparse documentation name this argument as opA. Shall I still use modeA? Do you think it necessary to comment here that this is corresponding to the opA in cusparse documentation?

wrengr added inline comments.May 23 2023, 3:21 PM
mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
1931

Also, you may want to make these attributes optional, where they default to NON_TRANSPOSE when not provided. Since doing so will help avoid needing to make most of the changes to Conversion/GPUCommon/GPUToLLVMConversion.cpp and the various mlir test files.

I think the cleanest way to do that would be to modify the definition of GPU_MatTransposeModeAttr to specify the default (so it can be shared by all the ops). Though you'll have to make sure to update the assemblyFormat of each op to allow dropping the curly braces when the mode isn't provided. And you'll probably need to add some additional builders for each op, so that they can be defaulted on the C++ side. (Ideally the standard builders should be able to handle this automatically, but I don't recall whether they do or not for this particular case.)

mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp
1402–1403

To avoid repeating this idiom all over, you should define a helper function (cf., SparseTensor/Transforms/CodegenUtils for some examples of us doing this elsewhere). I'd say ditto for dType idiom, though those aren't touched by this patch; so feel free to leave this cleanup for a separate CL

1403

Style-guide says to use static_cast here, instead of the C-style cast.

1487

What does this todo actually mean? Cuz it looks like you are passing the transpose modes...

aartbik added inline comments.May 23 2023, 3:28 PM
mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
1842

This will not be an SSA value, %opA, but an actual "literal" attribute, as in NON_TRANSPOSE, right?
So please update the examples accordingly

K-Wu updated this revision to Diff 524915.May 23 2023, 3:37 PM

addressing wren's comments

K-Wu updated this revision to Diff 524918.May 23 2023, 3:39 PM
K-Wu marked an inline comment as done.

rm TODOs already done

mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp
1487

Good catch! I forgot to remove this after finishing this item

K-Wu marked 2 inline comments as done.May 23 2023, 3:59 PM
K-Wu updated this revision to Diff 524924.May 23 2023, 4:01 PM

static_cast

K-Wu marked an inline comment as done.May 23 2023, 4:02 PM
K-Wu marked an inline comment as done.May 23 2023, 4:09 PM
K-Wu updated this revision to Diff 524926.May 23 2023, 4:10 PM

reformat

K-Wu updated this revision to Diff 524928.May 23 2023, 4:25 PM

upd examples in GPUOps.td

K-Wu marked an inline comment as done.May 23 2023, 4:26 PM
K-Wu updated this revision to Diff 524947.May 23 2023, 5:24 PM

rename to TransposeOperator to address concern

K-Wu marked an inline comment as done.May 23 2023, 5:25 PM
K-Wu updated this revision to Diff 524949.May 23 2023, 5:31 PM

renaming opA|B to tOpA|B

K-Wu added inline comments.May 23 2023, 5:33 PM
mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
1848

Okay let's see if this is better. I now use transposeOpA|B and tOpA|B.

K-Wu updated this revision to Diff 524952.May 23 2023, 5:45 PM

fix compile error

K-Wu updated this revision to Diff 524955.May 23 2023, 6:02 PM

add MatTransposeOp default value

mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
1931

This is somewhat addressed by the use of Default_GPU_MatTransposeOpAttr in the last commit

K-Wu updated this revision to Diff 524967.May 23 2023, 6:33 PM

fix stale assemblyFormat in gpu.spmm

K-Wu marked 2 inline comments as done.May 24 2023, 9:29 PM

presumably closing two comments. Please reopen/make new comments if you would like to suggest otherwise.

K-Wu updated this revision to Diff 525700.May 25 2023, 10:39 AM

rebase origin/main

K-Wu updated this revision to Diff 525713.May 25 2023, 11:26 AM

fixing unresolved merge conflict

K-Wu updated this revision to Diff 525714.May 25 2023, 11:27 AM

fixing unresolved merge conflict

K-Wu updated this revision to Diff 525728.May 25 2023, 11:46 AM
K-Wu added a comment.May 25 2023, 11:46 AM

fixing broken build

aartbik added inline comments.May 25 2023, 12:29 PM
mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
1813

Note that these should always be kept in sync with

typedef enum {

CUSPARSE_OPERATION_NON_TRANSPOSE       = 0,
CUSPARSE_OPERATION_TRANSPOSE           = 1,
CUSPARSE_OPERATION_CONJUGATE_TRANSPOSE = 2

} cusparseOperation_t;

I had a similar dilemma for the index types and the data types.

We definitely do not want to pull in cusparse.h values in here, but we also don't want to introduce inconsistencies later. I have no good solution for this yet, but at least document the requirement for keeping these consistent here

1837

add

NON_TRANSPOSE (the default), ...

since the doc page does not see the code above

1842

I like this syntax! Very neat!

Note that you get bonus points for a syntax where the default is never printed, only the non default cases
(okay for follow up,but we have similar policies for the sparse tensor type features)

1873

same on default

1910

default

1948

default

mlir/lib/Dialect/SparseTensor/Transforms/SparseGPUCodegen.cpp
745

you can remove the transpose form TODO from L740 ;-) since you have this now

K-Wu updated this revision to Diff 525768.May 25 2023, 1:11 PM

fixing test error

K-Wu added inline comments.May 25 2023, 1:13 PM
mlir/lib/Dialect/SparseTensor/Transforms/SparseGPUCodegen.cpp
745

This diff introduces the transpose attribute but I haven't added rewrite rules to recognize these patterns in loop nest and set the matrix as transposed. Shall I keep this comment and address it in new diffs?

wrengr added inline comments.May 25 2023, 1:15 PM
mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
1808

Even if you stick with the "op" name for the operation arguments, you should absolutely not use "op" here— because the "Op" suffix is explicitly and exclusively reserved for operations. Personally I'd go with TransposeMode since the "mat" part doesn't add any information/disambiguation. (If the "mat" prefix is there for consistency with cuSparse names, then I suppose it's okay)

1821–1822

You should instead add let defaultValue = "MatTransposeOp::NON_TRANSPOSE"; to the definition of GPU_MatTransposeOpAttr itself, since it's the natural default value for all uses of that EnumAttr. Whereas DefaultValuedAttr is more for situations where either (a) you don't have control over the original attribute definition and therefore can't change it, or (b) you have some op-specific default that's different from the usual default (or the usual lack thereof).

1848

I'm not a big fan of single-letter abbreviations like that. Single letters like "A"/"X" are okay because those are in some sense their full names, and the "spmat"/"dn" parts are adjectives applying to those names. (And, arguably, the "spmat"/"dn" parts can be dropped since that's already captured by the types themselves and so there would be no ambiguity when seeing getX or getA in the C++ code; though if that's what cuSparse calls them, then there's no harm in keeping the longer names.) Whereas "t" could mean anything, isn't part of a standardized naming scheme (like "A"/"X" are), and doesn't (imo) help to disambiguate what "op" means.

I totally get your motivation for going with "op", and I like where you're mind's at :) It's just unfortunate that "op" means something totally different in MLIR, so there's no good way to be consistent with both cuSparse and MLIR at the same time. Personally, I'd side with MLIR for this case, though I'd defer to our noble TL if he feels otherwise. My reasoning for resolving this case comes from a two-pronged argument about which side readers would be most familiar with, and hence is least likely to cause confusion. Since this is part of the MLIR ecosystem, folks must be at least somewhat familiar with MLIR. And since MLIR isn't ubiquitously popular, that (ironically) makes readers even more likely to be familiar with MLIR or trying to become so. (Whereas things which are ubiquitously popular often have dabblers who care more about The Other Thing.) Conversely, and I may well be wrong about this part, I don't think readers are as likely to be familiar with cuSparse, or rather as likely to have greater familiarity/comfortability with cuSparse than with MLIR.

Though now that you've mentioned that the name comes from cuSparse, I'm thinking I might know where they're coming from. Namely, I'm guessing they mean "op" in the sense of the category theory notation for duality ("A^{op}"), which means different things depending on what the "A" being dualized is: namely for real-matrices it's transpose, but for complex-matrices it's conjugate-transpose. But if that is indeed where they're coming from, then I feel like that's all the more reason to side with MLIR— since anyone who's familiar with the category theory notation will also be familiar with several other notations for these operations, since linear algebraists can't decide between the half-dozen different traditions for what to name them.

That said, if we do side with MLIR over cuSparse, I'm now thinking the better short-name would be "transA" rather than "modeA" (and if you're feeling more verbose, then the better middling-/long-names would be "transposeA" and "transposeModeA")

1856

You should make this whole chunk into an optional-group. I believe "( `{` $tOpA^ `}`)?" should be the correct syntax for that, though you may need to play around with it since sometimes the optional-group stuff doesn't parse in the way it seems like it should. For more information, cf: https://mlir.llvm.org/docs/DefiningDialects/AttributesAndTypes/#optional-and-default-valued-parameters

mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp
1203

I think it'd read nicer to use the prefix "constant" rather than "genConstFrom".

And this function is a prime example of why the enum must not be named with "Op": since when reading this I got very confused about why the argument was already an operation rather than the enum/attr, even despite the fact that I'd just re-read through the code above!

1204

Why not just have this function use llvmInt32Type directly? (Since that's the type that's used everywhere, and since this function doesn't do any validation that the parameter is indeed a 32-bit integer type, and since no other type would really make much sense)

mlir/lib/Dialect/SparseTensor/Transforms/SparseGPUCodegen.cpp
462

Do you actually need to pass this argument since it's the default value? (i.e., do the automatically generated builders allow skipping it) If so, then you should add a new builder definition which uses the default in lieu of requiring an argument here. (And ditto for all the other ops, of course)

mlir/test/Dialect/GPU/ops.mlir
338

If you adjust the assemblyFormat to make the transpose-mode optional like I showed, then you should be able to revert all these changes to the test files (since the generated printers ought to avoid printing the attribute whenever it's the default value)

K-Wu updated this revision to Diff 525787.May 25 2023, 2:01 PM

attempting to remove Default_GPU_MatTransposeOpAttr

K-Wu marked an inline comment as done.May 25 2023, 2:09 PM
K-Wu added inline comments.
mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
1848

That makes a lot of sense! I didn't know the cetegory theory notation and was confused when I saw cusparse uses op in their documentation. Thanks a lot for this very detailed reasoning. I will follow the MLIR naming convension as you mentioned then.

1856

Noted with great thanks! I really appreciate your mentioning the links and examples in all of your comments. These are great aid to me.

K-Wu marked 4 inline comments as done.May 25 2023, 2:12 PM
K-Wu updated this revision to Diff 525797.May 25 2023, 2:16 PM

explaining default value

K-Wu marked an inline comment as done.May 25 2023, 2:17 PM
K-Wu updated this revision to Diff 525808.May 25 2023, 2:26 PM

cleaning up int32type

K-Wu updated this revision to Diff 525826.May 25 2023, 2:49 PM

fixing error and cleaning up style

aartbik added inline comments.May 25 2023, 2:54 PM
mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
1808

nit, even though we will think about this, I would phrase it a bit different just in case this remain,
something like

To avoid coupling this dialect with cusparse.h specifics, we hardcoded magic literals in this enum.
Note that this should be kept in sync with cusparseOperation_t in cusparse.h:
typedef enum .....

// todo: find a proper way to keep them in sync?

K-Wu updated this revision to Diff 525839.May 25 2023, 3:17 PM

fixing compile error

K-Wu updated this revision to Diff 525845.May 25 2023, 3:40 PM

fixing compile errors and formatting

K-Wu marked 5 inline comments as done.May 25 2023, 3:45 PM
wrengr added inline comments.May 25 2023, 3:47 PM
mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
1815

"todo" should be all-caps :)

As for the content of the todo itself: Assuming cusparse.h is part of the cuSparse library itself, at what point does that library become coupled with the mlir libraries? Is it passed at runtime like our sparse-tensor runtime-library, or is it linked in at some earlier stage? If it's linked in before runtime, then in the file that brings them together we can always add a static_assert to ensure they're consistent at that point. Whereas if it's dynamic-linked at runtime, then I'm not sure there's any good way of ensuring consistency that's worth the cost...

mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp
1202

If this works, that's cool; but if the callsites don't require knowing that this is the specific operation being returned, then it would probably be better to return Value (both to reduce coupling, and because there are a few places where the use of templates gets a bit squirrely about implicitly converting operations to their result values.)

1202–1204

I'm guessing this isn't going to pass the clang-format checks, so be sure to re-run git clang-format HEAD^

1203

This function doesn't need the specifics of ConversionPatternRewriter, so you should change this parameter to OpBuilder &builder instead.

1206

This doesn't look right to me. The getContext method already returns MLIRContext*, so you don't need the address-of operator there

1207

You should just use rewriter.getIntegerType(32) instead.

mlir/lib/Dialect/SparseTensor/Transforms/SparseGPUCodegen.cpp
445

MLIR-style says variables should start with lowercase

aartbik added inline comments.May 25 2023, 4:03 PM
mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
1808

please avoid breaking the 80-col. Just break, and continue on next line (okay to have the Note... more to the right

K-Wu updated this revision to Diff 525857.May 25 2023, 4:06 PM
K-Wu marked 3 inline comments as done.

fixing compiling errors and addressing comments

K-Wu marked 2 inline comments as done.May 25 2023, 4:08 PM
K-Wu marked 4 inline comments as done.May 25 2023, 4:22 PM
K-Wu updated this revision to Diff 525873.May 25 2023, 4:38 PM

try optional attr in builder

K-Wu updated this revision to Diff 525879.May 25 2023, 4:49 PM

reverting

aartbik accepted this revision.May 25 2023, 4:57 PM
aartbik added inline comments.
mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
1842

maybe use another TRANSPOSE mode in the example, since we won't print the default anymore

1878

same

1916

same, or perhaps even one as default, one different?

mlir/lib/Dialect/SparseTensor/Transforms/SparseGPUCodegen.cpp
745

nah, I was just observing the overlap in TODOs now, but no biggie

This revision is now accepted and ready to land.May 25 2023, 4:57 PM
K-Wu updated this revision to Diff 525890.May 25 2023, 5:26 PM

attempting to add optional argument

K-Wu updated this revision to Diff 525891.May 25 2023, 5:50 PM

attempting to do default value

K-Wu updated this revision to Diff 525893.May 25 2023, 5:59 PM

attempting to make default value work

K-Wu updated this revision to Diff 525897.May 25 2023, 6:07 PM

attempt

wrengr added inline comments.May 25 2023, 6:08 PM
mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
1836–1838

I wonder if you could move this documentation to the definition of GPU_TransposeModeAttr itself, to avoid needing to repeat if for every op. (Just a nit, not a blocker)

mlir/lib/Dialect/SparseTensor/Transforms/SparseGPUCodegen.cpp
462

Fwiw, I think this was better when explicitly passing modeA, rather than passing std::nullptr. But the thing I had in mind was to have something more like:

def GPU_SpMVBufferSizeOp : ... {
let builders = [
  OpBuilder<(ins
      Variadic<GPU_AsyncToken>:$asyncDependencies,
      GPU_SparseEnvHandle:$env,
      GPU_SparseSpMatHandle:$spmatA,
      GPU_SparseDnVecHandle:$dnX,
      GPU_SparseDnVecHandle:$dnY), [{
    auto modeA = gpu::TransposeMode::NON_TRANSPOSE;
    return build($_builder, $_state, asyncDependencies, env, modeA, spmatA, dnX, dnY);
  }];
}

...or whatever massaging of that is necessary to get it to compile.

If you're having trouble getting this to work, just send me something on chat and we can try to work out the wrinkles :)

wrengr accepted this revision.May 25 2023, 6:08 PM
K-Wu updated this revision to Diff 525900.May 25 2023, 6:23 PM

using the builder Wren suggested

K-Wu updated this revision to Diff 525904.May 25 2023, 6:35 PM

fixing format and compile error in tablegen

K-Wu updated this revision to Diff 525912.May 25 2023, 7:27 PM

fixing unsupported type in tablegen

K-Wu updated this revision to Diff 525930.May 25 2023, 8:23 PM

fixed compile errors

K-Wu marked 4 inline comments as done.May 25 2023, 8:24 PM
K-Wu added inline comments.
mlir/lib/Dialect/SparseTensor/Transforms/SparseGPUCodegen.cpp
462

I make it work! Let me know if this looks good to you.

This revision was automatically updated to reflect the committed changes.