This is an archive of the discontinued LLVM Phabricator instance.

[mlir][sparse] Lowering for unary and binary
ClosedPublic

Authored by jim22k on Apr 4 2022, 11:09 AM.

Details

Summary

Code for lowering sparse_tensor.unary and sparse_tensor.binary within a linalg.generic block.

Diff Detail

Event Timeline

jim22k created this revision.Apr 4 2022, 11:09 AM
Herald added a reviewer: aartbik. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
jim22k requested review of this revision.Apr 4 2022, 11:09 AM

@aartbik Please take a look and let me know your thoughts on my general approach. I also indicated where I am stuck and could use some help, as I don't truly understand the lattice-set theory.

mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
554

I need help here. I'm not sure what I need to include such that absentVal is added to the output for every missing value in the input. I assume this is something similar to kTensor or kInvariant, as doing v+1 inside linalg.generic will create a dense output. I need something similar to happen here.

617

I'm stuck here. The kBinary below creates one kBinaryRegion and two kUnaryRegions. These are not called/handled for the vector tests. But for the matrix test, the kBinaryRegion is re-evaluated for some reason. The problem is that the origin sparse_tensor.binary operation has already been split apart. I essentially need a no-op here (i.e. it's already working fine. Don't re-evaluate).

I tried calling takeConj again, but that somehow eliminates the disjoint pieces that were added in the kBinary section.

mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_matrix_ops.mlir
107

This test fails with an assertion error:
Assertion failed: (!add || latGT(p0, p1)), function optimizeSet, file Merger.cpp, line 167.

mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_vector_ops.mlir
95

This test passes.

142

This test passes.

162

This test fails with an assertion error:
Assertion failed: (!add || latGT(p0, p1)), function optimizeSet, file Merger.cpp, line 167.

jim22k retitled this revision from Lowering for unary and binary to [mlir][sparse] WIP -- Lowering for unary and binary.Apr 4 2022, 11:58 AM
jim22k edited the summary of this revision. (Show Details)
aartbik added inline comments.Apr 13 2022, 11:44 AM
mlir/include/mlir/Dialect/SparseTensor/Utils/Merger.h
50

At first reading, I expected the kUnaryRegion and kBinaryRegion, but was a bit surprised with the kUnary and kBinary (since that generalizes a lot of the ops already there). Can you briefly document what the intended differences are here?

Scanning ahead, it seems like one is eventually broken up into several of the others, but I wonder if we can't just keep a single kUnaryRegion and kBinaryRegion, and let the op itself drive the codegen?

99

style: period at end in comment

Also, originally I avoided linking this TensorExpr back to IR of code in MLIR, since during lattice construction, one must be careful to maintain the right 1:1 correspondence between the lattice point and the IR. Obviously, we have little choice for the new binary/unary regions, but let's document that here carefully.

162

ah, I see you solved the issue I alluded to above by passing this explicitly to the merge operations; let's brainstorm later if we can somehow do this using the child nodes only...

mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
798–805

This is very surprising? Is this right?

1235

unnecessary format change?

1382

unnecessary format change?

mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
25

this is not completely in style with the general setup:

only assign the fields that are know to be set, assert on nullptr in all others

78

I think kUnary/BinaryRegion should be new cases that set op,
all others assert !op

538

note that this could be a simple cast, not a dyn_cast, but more preferable, it looks like we could simply keep a single kUnaryRegion case, see below

554

This is indeed a bit trickier. For the time being, to focus on the overall logic, let's simply put this case under the same as all the other zero preserving unary ops, and assert that absent() is not set. Then we will enhance the logic later.

617

If you rewrite it as suggested below, I think the no-op comes natural.

634

same, should be a cast here, but it looks like we could simply keep a single kBinaryRegion case by looking at the result of this cast (which then needs to be dynamic again ;-) and decide what to do. I think I prefer that a bit more than artificially introducing the kBinary/kUnary as "handled" cases.

641

it feels this whole block of code, L612 to L639 is really takeDisj with some smart selection of the branches.
Perhaps you can split out the analysis of the MLIR IR (getting the three branches), and then write a new

takeDisj(... , opboth, opleft, opright)

and put the takeDisj close to the other, just so that the actual lattic logic is not so deeply burried inside this huge block

739

At first reading, it was surprising to just see "UnaryOp" here, since that looks like just another arith:: version.
Prefixing it with namespace sparse_tensor:: is of course against styleguide, but would look more clear.

Perhaps we should rename the Unary/BinaryOp of sparse tensor dialect a bit more specific (can be done later).

881–882

what does it mean if we hit this part? an empty block? or an error?

jim22k added inline comments.Apr 13 2022, 5:13 PM
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
798–805

This is related to my handling of binary[overlap] and unary[present] when those regions are empty. If you look at Merger.cpp:617, it passes a nullptr for the Operation*. That, in turn, manifests in line 886 and returns Value() for the Value output. When that empty Value reaches this code, we skip the insertion.

It works, but it does feel like a hack. Can you think of a better way to indicate to the generating code that no output is needed?

mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
634

That seems reasonable. I will refactor and check the result of dyn_cast. If it fails, it means the BinaryOp has already been handled. buildLattices has unsigned return type, so what should I return in that case? This is where I want a no-op, meaning "don't add any new lattices points".

641

Good idea to move this up near the other takeDisj().

881–882

This means an empty block, so I want to indicate no value.

aartbik added inline comments.Apr 14 2022, 9:19 AM
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
798–805

It was unexpected, but am okay with this if (1) it is documented and (2) perhaps we can even add some form of assert when !rhs that verifies that indeed we hit an expected case

mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
634

If at all possible, I would like the "nop" to be detected at a higher level, i.e. before building the new lattices. Otherwise we would have to somehow return the existing set id (returning a nop value is a bit too intrusive to my taste). But after you have done the restructuring suggested here, I will have another look.

jim22k updated this revision to Diff 423272.EditedApr 16 2022, 7:48 PM

Updates based on feedback

  • Create 2nd takeDisj method specifically for binary
  • Eliminate kBinaryRegion/kUnaryRegion in favor of no "handled" cases
  • Add better comments
  • Eliminate the double handling of kBinary by looking at the loop level
jim22k added inline comments.Apr 16 2022, 8:01 PM
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
1235

clang-format made this change; not sure why

1382

same as above -- clang-format wanted this changed

1615

This is new and is my attempt to avoid calling kBinary twice so we avoid the need for a "no-op" return.
For matrices, there are two loops, so this piece of code is called twice. If we split apart the binary operation, we run into the need for a no-op. By passing topSort.size() - at, we only split apart the binary in the lowest loop. Previous calls will be a simple takeConj with the binary operation passed unchanged.

mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
82

This is a weaker assert, but is required because splitting up the binary result in some pieces which are fundamentally unary, so they don't have a y. In this iteration, I am still labeling all the split up chunks of binary as kBinary.

631

I don't know if this is the correct approach, but it makes the matrix tests all pass. We only split up the binary operation once and we never need to return a no-op. At least, that will hold as long as z==1 for the last time we try to buildLattices() on the binary operation.

659

I created the new takeDisj, which is very nice. It does mean that all of the split up pieces of kBinary remain labeled as kBinary, even for left and right which only have a single input argument.

912

These can be combined in the switch because their logic is identical. The binary operation splits up into pieces which may have 1 or 2 input arguments, so they look just like the unary operation being split up.

mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_matrix_ops.mlir
107

This is now passing.

jim22k updated this revision to Diff 423458.Apr 18 2022, 1:04 PM
  • Make the absent region of sparse_tensor.unary work
jim22k added inline comments.Apr 18 2022, 1:11 PM
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
554

I figured out how to make the absent region work. I essentially treat it like v + 1 (which is a binary function). Using the same logic, I perform a disjunction. The overlap will be replaced by the present block which only has 1 input argument, so it will sub in the left argument rather than being truly binary.
The rhs is a fixed value, so I create a kInvariant for it and it covers everything which is not a conjunction.

mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_vector_ops.mlir
162

This test is now passing.

jim22k updated this revision to Diff 423617.Apr 19 2022, 7:26 AM

Merge latest from main

jim22k updated this revision to Diff 423710.Apr 19 2022, 12:45 PM

Add test involving linalg.index

@aartbik I think this PR is ready for a full review. It's passing all the tests.

  • I do need you to look at the if (z != 1) logic and see if that is reasonable. It works for the tests, but I would appreciate your perspective on where that might break.
  • I also want to discuss the Kind that is assigned when unary and binary are broken up. I currently have the Kind staying unchanged, but that limits what can be asserted in TensorExps constructor. It also makes other checks strange -- for example, mapSet checks that we only pass in unary operations, but I also have to allow kBinary because some of the split up chunks require mapSet.
  • You said you wanted to brainstorm how to pass the blocks around via the children rather than the merge operations directly. I'm open to that if we can figure out a way. Otherwise, I don't think the current implementation is bad.

Thanks for making such diligent progress with this and apologies for the delay. A few high prio tasks came up but from here on it should be a bit smoother sailing!

mlir/include/mlir/Dialect/SparseTensor/Utils/Merger.h
99

Note that technically this field is only used for unary and binary, but putting this in a union or so seems too convoluted. But just add a comment on when this field is set.

162

I am okay with this for now (and perhaps permanently). It at least makes semantics at top level explicit.

256

"e" and "i" have some designated meaning, so please don't change into im.

Here "z" can use a comment in the method description.
But I am not sure about this approach, see below.

mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
798–805

It is a bit strange to refer to "Kinds" here, just rephrase the comment in general terms.
Anything we can assert on rhs or t if not set?

1615

This feels hacky. In the original iteration space theory, building the lattice sets solely depends on the current index, and it has nothing to do with the nesting depth or anything like that. In principle, we need to make this call at every loop nest to determine the remaining expressions.

mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
80

assert on op at least

82

This feels hacky. I found these asserts very useful to ensure we are truly doing the right thing. If we introduce all sorts of intermediate stages, as in binary but marked unary etc, such important invariants are lost. Can we cleanup the "top level" logic that transforms binary into unary (this is e.g. also done around binary minus into unary minus)

85

&& op

158

period at end, here and below.

643

period at end

739

braces not needed

782

braces not needed

916

period at end

923

period at end

926

here and below, document this shortcut returns

mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_matrix_ops.mlir
56

Period at end

57

Rather than adding and modifying an existing test, I would much rather that you add two new integration tests: unary and binary. Yes, it has some boiler plate to setup inputs, but that way it is more clear where the semi-ring stuff is tested.

mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_vector_ops.mlir
93

Same here. I would move these into the semi ring integration tests mentioned above, and simply add the novec/vec flags at the top (a test can have more than one RUN/CHECK series)

aartbik retitled this revision from [mlir][sparse] WIP -- Lowering for unary and binary to [mlir][sparse] Lowering for unary and binary.Apr 20 2022, 1:30 PM
jim22k added inline comments.Apr 21 2022, 1:57 PM
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
1615

I agree. This was a quick and dirty approach, but I've convinced myself that it is not just hacky, but actually gives wrong results.
I figured out another way which is giving better results and avoids the need for this hack. I will clean up the code and show it off in the next revision.

aartbik added inline comments.Apr 21 2022, 2:42 PM
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
1615

Looking forward to that. Thanks!

jim22k updated this revision to Diff 424560.EditedApr 22 2022, 12:02 PM

Change how repeat calls to unary and binary are handled

  • Remove z parameter from buildLattices
  • Add mergeOp to TensorExp
  • Move tests to their own files

Some things to notice with these changes:

  1. There are now two Operation * in TensorExp. This lets us keep the original binary or unary operation as well as the yield operation to be merged after splitting it up into pieces. By keeping the original, it allows us to perform the same logic in further nested loops. In some of the split up pieces (disjoint left or right regions), we do not include the original op. This essentially signals that these pieces have been "handled". Only one region will contain the original and will continue building in the next level down.
  2. There is one case where the Kind changes. kBinary keeps the conjoint piece as kBinary, but the disjoint pieces become kUnary. This lets us retain the check in mapSet that requires a unary kind. If you don't like this change of kind, I can change it. It only affects some of the asserts.
aartbik added inline comments.Apr 25 2022, 4:00 PM
mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td
447

Can you actually add a test of this, something like

%0 = linalg.generic #trait_vec_op
   ins(%arga, %argb: tensor<?xi64, #SparseVector>, tensor<?xi64, #SparseVector>)
    outs(%xv: tensor<?xi64, #SparseVector>) {
    ^bb(%a: i64, %b: i64, %x: i64):
      %idx = linalg.index 0 : index
      %cst = arith.index_cast %idx : index to i64
      %1 = sparse_tensor.binary %a, %b : i64, i64 to i64
        overlap={
          ^bb0(%a0: i64, %b0: i64):
            sparse_tensor.yield %cst : i64
        }
        left={}
        right={}
      linalg.yield %1 : i64
} -> tensor<?xi64, #SparseVector>

and then see if we indeed generate the right code? I am not 100% sure how the part outside the new op will interact with codegen

mlir/include/mlir/Dialect/SparseTensor/Utils/Merger.h
99

Please add a bit more description for both, i.e. origOp points to original unary/unary op, mergeOp points to ... before going in detail of what needs to be there

101

This is a better solution than the original nesting depth based one. Still, the danger of a struct is of course that it is easy to add members :-). And although I doubt this is on any memory profile yet, still a bit worrisome to grow the sizeof for all nodes now. So if we keep this, at least add a TODO on improving memory usage in the future.

Furthermore, I am wondering if it perhaps would be better to reflect the state in the type (sort of what you had originally) and just store one op. All of this can be done later, just thinking out loud here.

mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
798–805

did you see these comments?

mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
150

For this first version, I actually prefer that we keep takeDisj intact (other than passing op)
and add a new

takeCombi

method that implements your new logic, just so we don't need to touch other ops in this revision.
We can perhaps merge these methods later into one, but now too much is changing at once.

536

splinter?

539

I patched in your revision, but got compilation errors here.
I think you need to include

#include "mlir/Dialect/SparseTensor/IR/SparseTensor.h"

550

period at end

mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_binary.mlir
43

Here, and probably for the unary too, scan we add a case where one of the operands is dense (either truly dense, or annotated with "dense"). Just to verify that a lattice with "universal" index is properly tested.

209

looks like these two are never release (thus will fail memsan)

mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_unary.mlir
41

You have a test with absent, and one with present.
How about also testing when both are set, just for completeness?

Like your

%result = sparse_tensor.unary %a : f64 to i32

present={
  ^bb0(%x: f64):
    %ret = arith.constant 1 : i32
    sparse_tensor.yield %ret : i32
}
absent={
  %ret = arith.constant -1 : i32
  sparse_tensor.yield %ret : i32
}

example?

In fact, even though you test the present case with a matrix, how about also doing that for a vector, just so that all cases are covered for a single loop example. That way, the code also acts like a nice illustration of the feature

113

indentation is off

120

do we want to do the same thing here?
i.e. print values first (to see that we only compute sparse values)
and then print full matrix for stucture?

141

This is never released (and will thus fail our memsan test).

jim22k updated this revision to Diff 425306.Apr 26 2022, 2:33 PM

Change back to one Operation *

  • Add new kBinaryHalf kind
  • kUnary and kBinary hold their original operation until buildExp
  • During buildExp, locate the primary YieldOp and merge that
  • Add more tests
jim22k added inline comments.Apr 27 2022, 7:28 AM
mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td
447

This will not work. I get this error:

/Users/jkitchen/Projects/HIVE/llvm-project/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_binary.mlir:118:16: error: 'arith.index_cast' op operation destroyed but still has uses
        %cst = arith.index_cast %idx : index to i32
               ^
/Users/jkitchen/Projects/HIVE/llvm-project/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_binary.mlir:118:16: note: see current operation: %0 = "arith.index_cast"(<<NULL VALUE>>) : (<<NULL TYPE>>) -> i32
/Users/jkitchen/Projects/HIVE/llvm-project/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_binary.mlir:113:10: note: - use: "sparse_tensor.lex_insert"(%2, %21, <<UNKNOWN SSA VALUE>>) : (tensor<?xi32, #sparse_tensor.encoding<{ dimLevelType = [ "compressed" ], pointerBitWidth = 0, indexBitWidth = 0 }>>, memref<?xindex>, i32) -> ()

    %0 = linalg.generic #trait_vec_op
         ^
LLVM ERROR: operation destroyed but still has uses

The only way I have been able to use linalg.index is to pass its result into sparse_tensor.binary as one of the arguments, as seen on line 449. That seems to work fine.

mlir/include/mlir/Dialect/SparseTensor/Utils/Merger.h
101

Alright, I settled on another new approach. We can make this work with only a single Operation * in the struct.

  • kUnary: Operation * is the sparse_tensor.unary
  • kBinary: Operation * is the sparse_tensor.binary
  • kUnaryHandled: Operation * is the YieldOp

When buildLattices encounters kUnary or kBinary, it will build the lattice structure with potentially many regions (i.e. binary always calls takeDisj). The regions which can be considered fully handled will change to kUnaryHandled. The primary region of both unary and binary will remain as kUnary or kBinary and will hold on to the original operation (not the yield). This will let us continue to use the original at every level of the lattice.

When we finally call buildExp, we can dig into the original unary and binary operations and pull out the primary region's YieldOp at that point.

This approach also avoids confusion because kUnary and kBinary always refer to the correct sparse_tensor.unary and sparse_tensor.binary operations. Any disjoint offshoots use a different "Kind" to clarify that they are not the original operation anymore.

I will update the PR with these changes.

mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
798–805

I am adding a check to ensure that the kind is either kUnary or kBinary.

mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
539

That should already be there. See line 11.

Sorry my comments are out of order with the new PR. They should be read as if they happened before the new PR. I forgot to submit them until now.

A few last comments. Sorry for being nitpicky on this, rest assured I really like the work, I am just a bit particular on certain things.
Also, can you please make a pass over all my comments and mark them "Done" (or comment on them otherwise).
That makes the re-review a bit easier, since a lot of my past comments still show up as unresolv.ed

mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td
440

See below, but in a nutshell, let's not rewrite the example just because the desired form does not work yet.

447

But by passing it as argument, we lose the intersection/union power of the binary op.
Don't you agree that the code above should work, or the code example you had originally?

(okay doing this later, but I would prefer that we show a binary op example in the doc that uses two sparse inputs *and* uses the index in one or more of the branches, and not rewrite the example to match the current implementation status)

The problem is that we probably need some work dealing with the "outside" block computations that are only used in some places (and thus can be placed under the conditional). We can put a TODO if you prefer.

mlir/include/mlir/Dialect/SparseTensor/Utils/Merger.h
49

Maybe add a comment, as in

// semi-ring unary op

67

maybe add same-line comment as in

// semi-ring binary op

101

Looks like you iterated over some naming, since I see kBinaryHalf now.
Since we are bikeshedding, how about kBinaryBranch to show we have one case of a binary now?

mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
803

if the if-part has braces, so does the else-part, even when it is a single line stmt

mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
24–25

: kind(k), val(v), op(operation) {

and then nothing below, since these are not part of a union so can always be set

Also, it feels weird to have a longer parameter name then field name, so perhaps call the parameter simply "o"
(which is in style with the current conventions)

80

can we assert on y in the new model?

142

does this line apply to L144? If so, please please after that line

349

please follow same order as in enum decl.

534

just use cast (not dyn_cast) and no assert, since this should alway work

545

same here and below, if the cast should work, just cast

797

cast, we assume that verifier has filtered out bad cases

913

The unary and binary codegen blocks are a bit too large for this context (most others are oneliners)
so please move into own method

jim22k marked 63 inline comments as done.Apr 28 2022, 9:16 AM
jim22k added inline comments.
mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td
447

I would hope that your example test (and the documentation example I originally had) would work. If you are okay with a documentation example that gives an error, but is the desired end goal, I will revert to that. I am just sensitive to introducing a new feature with broken examples, giving people who want to try it out a bad impression of the work, unless you are going to figure out how to make it work before we merge.

mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
80

We cannot. During the original call, y==-1.
When absentRegion is empty, we call mapSet and y remains equal to -1.
But if absentRegion is not empty, we call takeDisj and both x and y are set.

For this "binary" case handling the absent region, when buildExp is called, we only utilize v0 and completely ignore v1, even though it exists from a lattice perspective.

Would it be helpful to add a comment here about why we can't assert anything about y?

jim22k updated this revision to Diff 425824.Apr 28 2022, 9:52 AM
jim22k marked 2 inline comments as done.

Updates based on feedback

  • Change kBinaryHalf to kBinaryBranch
  • Revert binaryop doc example
  • Remove dynamic casting
aartbik added inline comments.Apr 28 2022, 9:56 AM
mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td
447

I am okay with that, but please add a small note on that in the doc (this is intended future behavior but still under construction ;-)

mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
80

Yeah, please document, so that in the future I do not try to put the assert back ;-)

jim22k marked 2 inline comments as done.Apr 28 2022, 11:51 AM

Okay, adding more comments.

jim22k updated this revision to Diff 425870.Apr 28 2022, 11:51 AM

Few more comments

aartbik accepted this revision.May 2 2022, 5:20 PM

A few last comments, but I am giving you the LGTM, since I think this is ready to go in, so we can work on refining the few remaining issues.
Thanks for your patience during the review, and thanks for this contribution!

mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td
447

I don't see the note in the doc (I really meant, the user facing example, so that people trying this out are aware it does not work yet). So something like

Example of A+B in upper triangle, A-B in lower triangle (not working yet, but construct will be available soon).

mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
790

mark these new helpers as "static" since they are private to the file

This revision is now accepted and ready to land.May 2 2022, 5:20 PM
jim22k updated this revision to Diff 426728.May 3 2022, 8:38 AM

Make helper functions static

jim22k marked 2 inline comments as done.May 3 2022, 8:41 AM

@aartbik Thanks for reviewing and helping me converge on a good solution for lowering.
Once the build passes, I will merge it.

This revision was automatically updated to reflect the committed changes.