Code for lowering sparse_tensor.unary and sparse_tensor.binary within a linalg.generic block.
Diff Detail
- Repository: rG LLVM Github Monorepo
@aartbik Please take a look and let me know your thoughts on my general approach. I also indicated where I am stuck and could use some help, as I don't truly understand the lattice-set theory.
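For context, a minimal sketch of the kind of kernel being lowered here, with sparse_tensor.binary embedded in a linalg.generic. This is illustrative only: the trait, types, and region bodies are assumptions adapted from the dialect examples that appear later in this thread, not code from the patch itself.

```mlir
// Illustrative only: #trait_vec_op and #SparseVector are assumed to be
// defined as in the usual sparse_tensor examples.
%0 = linalg.generic #trait_vec_op
       ins(%arga, %argb: tensor<?xf64, #SparseVector>, tensor<?xf64, #SparseVector>)
       outs(%xv: tensor<?xf64, #SparseVector>) {
  ^bb(%a: f64, %b: f64, %x: f64):
    // overlap runs where both inputs have stored values; the empty
    // left/right regions mean disjoint entries produce no output.
    %1 = sparse_tensor.binary %a, %b : f64, f64 to f64
      overlap={
        ^bb0(%a0: f64, %b0: f64):
          %sum = arith.addf %a0, %b0 : f64
          sparse_tensor.yield %sum : f64
      }
      left={}
      right={}
    linalg.yield %1 : f64
} -> tensor<?xf64, #SparseVector>
```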
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp | ||
---|---|---|
555 | I need help here. I'm not sure what I need to include such that absentVal is added to the output for every missing value in the input. I assume this is something similar to kTensor or kInvariant, as doing v+1 inside linalg.generic will create a dense output. I need something similar to happen here. | |
614 | I'm stuck here. The kBinary below creates one kBinaryRegion and two kUnaryRegions. These are not called/handled for the vector tests. But for the matrix test, the kBinaryRegion is re-evaluated for some reason. The problem is that the original sparse_tensor.binary operation has already been split apart. I essentially need a no-op here (i.e. it's already working fine; don't re-evaluate). I tried calling takeConj again, but that somehow eliminates the disjoint pieces that were added in the kBinary section. | |
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_matrix_ops.mlir | ||
107 | This test fails with an assertion error: | |
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_vector_ops.mlir | ||
95 | This test passes. | |
142 | This test passes. | |
162 | This test fails with an assertion error: |
mlir/include/mlir/Dialect/SparseTensor/Utils/Merger.h | ||
---|---|---|
50 | At first reading, I expected the kUnaryRegion and kBinaryRegion, but was a bit surprised with the kUnary and kBinary (since that generalizes a lot of the ops already there). Can you briefly document what the intended differences are here? Scanning ahead, it seems like one is eventually broken up into several of the others, but I wonder if we can't just keep a single kUnaryRegion and kBinaryRegion, and let the op itself drive the codegen? | |
99 | style: period at end in comment. Also, originally I avoided linking this TensorExp back to actual IR in MLIR, since during lattice construction, one must be careful to maintain the right 1:1 correspondence between the lattice point and the IR. Obviously, we have little choice for the new binary/unary regions, but let's document that here carefully. | |
162 | ah, I see you solved the issue I alluded to above by passing this explicitly to the merge operations; let's brainstorm later if we can somehow do this using the child nodes only... | |
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp | ||
797–803 | This is very surprising? Is this right? | |
1226 | unnecessary format change? | |
1373 | unnecessary format change? | |
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp | ||
26 | this is not completely in style with the general setup: only assign the fields that are known to be set, and assert on nullptr for all others | |
61–79 | I think kUnary/BinaryRegion should be new cases that set op, | |
539 | note that this could be a simple cast, not a dyn_cast; but preferably, it looks like we could simply keep a single kUnaryRegion case, see below | |
555 | This is indeed a bit trickier. For the time being, to focus on the overall logic, let's simply put this case under the same as all the other zero preserving unary ops, and assert that absent() is not set. Then we will enhance the logic later. | |
614 | If you rewrite it as suggested below, I think the no-op comes natural. | |
631 | same, should be a cast here, but it looks like we could simply keep a single kBinaryRegion case by looking at the result of this cast (which then needs to be dynamic again ;-)) and deciding what to do. I think I prefer that a bit more than artificially introducing the kBinary/kUnary as "handled" cases. | |
638 | it feels like this whole block of code, L612 to L639, is really takeDisj with some smart selection of the branches, i.e. takeDisj(..., opboth, opleft, opright); and please put the takeDisj close to the other one, just so that the actual lattice logic is not so deeply buried inside this huge block | |
741 | At first reading, it was surprising to just see "UnaryOp" here, since that looks like just another arith:: version. Perhaps we should rename the Unary/BinaryOp of sparse tensor dialect a bit more specific (can be done later). | |
855–856 | what does it mean if we hit this part? an empty block? or an error? |
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp | ||
---|---|---|
797–803 | This is related to my handling of binary[overlap] and unary[present] when those regions are empty. If you look at Merger.cpp:617, it passes a nullptr for the Operation*. That, in turn, manifests in line 886 and returns Value() for the Value output. When that empty Value reaches this code, we skip the insertion. It works, but it does feel like a hack. Can you think of a better way to indicate to the generating code that no output is needed? | |
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp | ||
631 | That seems reasonable. I will refactor and check the result of dyn_cast. If it fails, it means the BinaryOp has already been handled. buildLattices has unsigned return type, so what should I return in that case? This is where I want a no-op, meaning "don't add any new lattice points". | |
638 | Good idea to move this up near the other takeDisj(). | |
855–856 | This means an empty block, so I want to indicate no value. |
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp | ||
---|---|---|
797–803 | It was unexpected, but am okay with this if (1) it is documented and (2) perhaps we can even add some form of assert when !rhs that verifies that indeed we hit an expected case | |
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp | ||
631 | If at all possible, I would like the "nop" to be detected at a higher level, i.e. before building the new lattices. Otherwise we would have to somehow return the existing set id (returning a nop value is a bit too intrusive for my taste). But after you have done the restructuring suggested here, I will have another look. |
Updates based on feedback
- Create 2nd takeDisj method specifically for binary
- Eliminate kBinaryRegion/kUnaryRegion in favor of no "handled" cases
- Add better comments
- Eliminate the double handling of kBinary by looking at the loop level
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp | ||
---|---|---|
1226 | clang-format made this change; not sure why | |
1373 | same as above -- clang-format wanted this changed | |
1606 | This is new and is my attempt to avoid calling kBinary twice so we avoid the need for a "no-op" return. | |
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp | ||
65 | This is a weaker assert, but is required because splitting up the binary results in some pieces which are fundamentally unary, so they don't have a y. In this iteration, I am still labeling all the split-up chunks of binary as kBinary. | |
628 | I don't know if this is the correct approach, but it makes the matrix tests all pass. We only split up the binary operation once and we never need to return a no-op. At least, that will hold as long as z==1 for the last time we try to buildLattices() on the binary operation. | |
656 | I created the new takeDisj, which is very nice. It does mean that all of the split up pieces of kBinary remain labeled as kBinary, even for left and right which only have a single input argument. | |
886 | These can be combined in the switch because their logic is identical. The binary operation splits up into pieces which may have 1 or 2 input arguments, so they look just like the unary operation being split up. | |
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_matrix_ops.mlir | ||
107 | This is now passing. |
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp | ||
---|---|---|
555 | I figured out how to make the absent region work. I essentially treat it like v + 1 (which is a binary function). Using the same logic, I perform a disjunction. The overlap will be replaced by the present block which only has 1 input argument, so it will sub in the left argument rather than being truly binary. | |
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_vector_ops.mlir | ||
162 | This test is now passing. |
@aartbik I think this PR is ready for a full review. It's passing all the tests.
- I do need you to look at the if (z != 1) logic and see if that is reasonable. It works for the tests, but I would appreciate your perspective on where that might break.
- I also want to discuss the Kind that is assigned when unary and binary are broken up. I currently have the Kind staying unchanged, but that limits what can be asserted in TensorExps constructor. It also makes other checks strange -- for example, mapSet checks that we only pass in unary operations, but I also have to allow kBinary because some of the split up chunks require mapSet.
- You said you wanted to brainstorm how to pass the blocks around via the children rather than the merge operations directly. I'm open to that if we can figure out a way. Otherwise, I don't think the current implementation is bad.
Thanks for making such diligent progress with this and apologies for the delay. A few high prio tasks came up but from here on it should be a bit smoother sailing!
mlir/include/mlir/Dialect/SparseTensor/Utils/Merger.h | ||
---|---|---|
99 | Note that technically this field is only used for unary and binary, but putting this in a union or so seems too convoluted. Just add a comment on when this field is set. | |
162 | I am okay with this for now (and perhaps permanently). It at least makes semantics at top level explicit. | |
256 | "e" and "i" have some designated meaning, so please don't change into im. Here "z" can use a comment in the method description. | |
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp | ||
797–803 | It is a bit strange to refer to "Kinds" here, just rephrase the comment in general terms. | |
1606 | This feels hacky. In the original iteration space theory, building the lattice sets solely depends on the current index, and it has nothing to do with the nesting depth or anything like that. In principle, we need to make this call at every loop nest to determine the remaining expressions. | |
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp | ||
63 | assert on op at least | |
65 | This feels hacky. I found these asserts very useful to ensure we are truly doing the right thing. If we introduce all sorts of intermediate stages, as in binary-but-marked-unary etc., such important invariants are lost. Can we clean up the "top level" logic that transforms binary into unary (this is e.g. also done around binary minus into unary minus)? | |
68 | && op | |
159 | period at end, here and below. | |
640 | period at end | |
741 | braces not needed | |
784 | braces not needed | |
890 | period at end | |
897 | period at end | |
900 | here and below, document these shortcut returns | |
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_matrix_ops.mlir | ||
56 | Period at end | |
57 | Rather than adding and modifying an existing test, I would much rather that you add two new integration tests: unary and binary. Yes, it has some boiler plate to setup inputs, but that way it is more clear where the semi-ring stuff is tested. | |
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_vector_ops.mlir | ||
93 | Same here. I would move these into the semi ring integration tests mentioned above, and simply add the novec/vec flags at the top (a test can have more than one RUN/CHECK series) |
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp | ||
---|---|---|
1606 | I agree. This was a quick and dirty approach, but I've convinced myself that it is not just hacky, but actually gives wrong results. |
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp | ||
---|---|---|
1606 | Looking forward to that. Thanks! |
Change how repeat calls to unary and binary are handled
- Remove z parameter from buildLattices
- Add mergeOp to TensorExp
- Move tests to their own files
Some things to notice with these changes:
- There are now two Operation * in TensorExp. This lets us keep the original binary or unary operation as well as the yield operation to be merged after splitting it up into pieces. Keeping the original allows us to perform the same logic in further nested loops. In some of the split-up pieces (disjoint left or right regions), we do not include the original op. This essentially signals that these pieces have been "handled". Only one region will contain the original and will continue building in the next level down.
- There is one case where the Kind changes. kBinary keeps the conjoint piece as kBinary, but the disjoint pieces become kUnary. This lets us retain the check in mapSet that requires a unary kind. If you don't like this change of kind, I can change it. It only affects some of the asserts.
mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td | ||
---|---|---|
447 | Can you actually add a test of this, something like

```mlir
%0 = linalg.generic #trait_vec_op
       ins(%arga, %argb: tensor<?xi64, #SparseVector>, tensor<?xi64, #SparseVector>)
       outs(%xv: tensor<?xi64, #SparseVector>) {
  ^bb(%a: i64, %b: i64, %x: i64):
    %idx = linalg.index 0 : index
    %cst = arith.index_cast %idx : index to i64
    %1 = sparse_tensor.binary %a, %b : i64, i64 to i64
      overlap={
        ^bb0(%a0: i64, %b0: i64):
          sparse_tensor.yield %cst : i64
      }
      left={}
      right={}
    linalg.yield %1 : i64
} -> tensor<?xi64, #SparseVector>
```

and then see if we indeed generate the right code? I am not 100% sure how the part outside the new op will interact with codegen | |
mlir/include/mlir/Dialect/SparseTensor/Utils/Merger.h | ||
99 | Please add a bit more description for both, i.e. origOp points to the original unary/binary op, mergeOp points to ... before going into detail of what needs to be there | |
101 | This is a better solution than the original nesting depth based one. Still, the danger of a struct is of course that it is easy to add members :-). And although I doubt this is on any memory profile yet, still a bit worrisome to grow the sizeof for all nodes now. So if we keep this, at least add a TODO on improving memory usage in the future. Furthermore, I am wondering if it perhaps would be better to reflect the state in the type (sort of what you had originally) and just store one op. All of this can be done later, just thinking out loud here. | |
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp | ||
797–803 | did you see these comments? | |
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp | ||
151 | For this first version, I actually prefer that we keep takeDisj intact (other than passing op) and add a takeCombi method that implements your new logic, just so we don't need to touch other ops in this revision. | |
537 | splinter? | |
540 | I patched in your revision, but got compilation errors here; it looks like this needs #include "mlir/Dialect/SparseTensor/IR/SparseTensor.h" | |
551 | period at end | |
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_binary.mlir | ||
43 | Here, and probably for the unary too, can we add a case where one of the operands is dense (either truly dense, or annotated with "dense")? Just to verify that a lattice with "universal" index is properly tested. | |
209 | looks like these two are never released (thus will fail memsan) | |
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_unary.mlir | ||
41 | You have a test with absent, and one with present. Like your

```mlir
%result = sparse_tensor.unary %a : f64 to i32
  present={
    ^bb0(%x: f64):
      %ret = arith.constant 1 : i32
      sparse_tensor.yield %ret : i32
  }
  absent={
    %ret = arith.constant -1 : i32
    sparse_tensor.yield %ret : i32
  }
```

example? In fact, even though you test the present case with a matrix, how about also doing that for a vector, just so that all cases are covered for a single loop example. That way, the code also acts like a nice illustration of the feature | |
113 | indentation is off | |
120 | do we want to do the same thing here? | |
141 | This is never released (and will thus fail our memsan test). |
Change back to one Operation *
- Add new kBinaryHalf kind
- kUnary and kBinary hold their original operation until buildExp
- During buildExp, locate the primary YieldOp and merge that
- Add more tests
mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td | ||
---|---|---|
447 | This will not work. I get this error:

```
/Users/jkitchen/Projects/HIVE/llvm-project/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_binary.mlir:118:16: error: 'arith.index_cast' op operation destroyed but still has uses
  %cst = arith.index_cast %idx : index to i32
         ^
/Users/jkitchen/Projects/HIVE/llvm-project/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_binary.mlir:118:16: note: see current operation: %0 = "arith.index_cast"(<<NULL VALUE>>) : (<<NULL TYPE>>) -> i32
/Users/jkitchen/Projects/HIVE/llvm-project/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_binary.mlir:113:10: note: - use: "sparse_tensor.lex_insert"(%2, %21, <<UNKNOWN SSA VALUE>>) : (tensor<?xi32, #sparse_tensor.encoding<{ dimLevelType = [ "compressed" ], pointerBitWidth = 0, indexBitWidth = 0 }>>, memref<?xindex>, i32) -> ()
  %0 = linalg.generic #trait_vec_op
       ^
LLVM ERROR: operation destroyed but still has uses
```

The only way I have been able to use linalg.index is to pass its result into sparse_tensor.binary as one of the arguments, as seen on line 449. That seems to work fine. | |
mlir/include/mlir/Dialect/SparseTensor/Utils/Merger.h | ||
101 | Alright, I settled on another new approach. We can make this work with only a single Operation * in the struct. When buildLattices encounters kUnary or kBinary, it will build the lattice structure with potentially many regions (i.e. binary always calls takeDisj). The regions which can be considered fully handled will change to kUnaryHandled. The primary region of both unary and binary will remain as kUnary or kBinary and will hold on to the original operation (not the yield). This will let us continue to use the original at every level of the lattice. When we finally call buildExp, we can dig into the original unary and binary operations and pull out the primary region's YieldOp at that point. This approach also avoids confusion because kUnary and kBinary always refer to the correct sparse_tensor.unary and sparse_tensor.binary operations. Any disjoint offshoots use a different "Kind" to clarify that they are not the original operation anymore. I will update the PR with these changes. | |
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp | ||
797–803 | I am adding a check to ensure that the kind is either kUnary or kBinary. | |
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp | ||
540 | That should already be there. See line 11. |
Sorry my comments are out of order with the new PR. They should be read as if they happened before the new PR. I forgot to submit them until now.
A few last comments. Sorry for being nitpicky on this, rest assured I really like the work, I am just a bit particular on certain things.
Also, can you please make a pass over all my comments and mark them "Done" (or comment on them otherwise).
That makes the re-review a bit easier, since a lot of my past comments still show up as unresolved.
mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td | ||
---|---|---|
440 | See below, but in a nutshell, let's not rewrite the example just because the desired form does not work yet. | |
447 | But by passing it as an argument, we lose the intersection/union power of the binary op. (Okay with doing this later, but I would prefer that we show a binary op example in the doc that uses two sparse inputs *and* uses the index in one or more of the branches, and not rewrite the example to match the current implementation status.) The problem is that we probably need some work dealing with the "outside" block computations that are only used in some places (and thus can be placed under the conditional). We can put a TODO if you prefer. | |
mlir/include/mlir/Dialect/SparseTensor/Utils/Merger.h | ||
49 | Maybe add a comment, as in // semi-ring unary op | |
67 | maybe add same-line comment as in // semi-ring binary op | |
101 | Looks like you iterated over some naming, since I see kBinaryHalf now. | |
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp | ||
802 | if the if-part has braces, so does the else-part, even when it is a single line stmt | |
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp | ||
25 | : kind(k), val(v), op(operation) { and then nothing below, since these are not part of a union so can always be set. Also, it feels weird to have a longer parameter name than field name, so perhaps call the parameter simply "o" | |
63 | can we assert on y in the new model? | |
143 | does this line apply to L144? If so, please place it after that line | |
350 | please follow same order as in enum decl. | |
535 | just use cast (not dyn_cast) and no assert, since this should always work | |
546 | same here and below, if the cast should work, just cast | |
799 | cast, we assume that verifier has filtered out bad cases | |
887 | The unary and binary codegen blocks are a bit too large for this context (most others are oneliners) |
mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td | ||
---|---|---|
447 | I would hope that your example test (and the documentation example I originally had) would work. If you are okay with a documentation example that gives an error, but is the desired end goal, I will revert to that. I am just sensitive to introducing a new feature with broken examples, giving people who want to try it out a bad impression of the work, unless you are going to figure out how to make it work before we merge. | |
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp | ||
63 | We cannot. During the original call, y==-1. For this "binary" case handling the absent region, when buildExp is called, we only utilize v0 and completely ignore v1, even though it exists from a lattice perspective. Would it be helpful to add a comment here about why we can't assert anything about y? |
Updates based on feedback
- Change kBinaryHalf to kBinaryBranch
- Revert binaryop doc example
- Remove dynamic casting
mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td | ||
---|---|---|
447 | I am okay with that, but please add a small note on that in the doc (this is intended future behavior but still under construction ;-) | |
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp | ||
63 | Yeah, please document, so that in the future I do not try to put the assert back ;-) |
A few last comments, but I am giving you the LGTM, since I think this is ready to go in, so we can work on refining the few remaining issues.
Thanks for your patience during the review, and thanks for this contribution!
mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td | ||
---|---|---|
447 | I don't see the note in the doc (I really meant the user-facing example, so that people trying this out are aware it does not work yet). So something like: "Example of A+B in upper triangle, A-B in lower triangle (not working yet, but construct will be available soon)." | |
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp | ||
792 | mark these new helpers as "static" since they are private to the file |
@aartbik Thanks for reviewing and helping me converge on a good solution for lowering.
Once the build passes, I will merge it.