This is an archive of the discontinued LLVM Phabricator instance.

@aartbik Please take a look and let me know your thoughts on my general approach. I also indicated where I am stuck and could use some help, as I don't truly understand the lattice-set theory.

mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
545	I need help here. I'm not sure what I need to include such that `absentVal` is added to the output for every missing value in the input. I assume this is something similar to kTensor or kInvariant, as doing `v+1` inside linalg.generic will create a dense output. I need something similar to happen here.
614	I'm stuck here. The `kBinary` below creates one kBinaryRegion and two kUnaryRegions. These are not called/handled for the vector tests. But for the matrix test, the kBinaryRegion is re-evaluated for some reason. The problem is that the original `sparse_tensor.binary` operation has already been split apart. I essentially need a no-op here (i.e. it's already working fine; don't re-evaluate). I tried calling `takeConj` again, but that somehow eliminates the disjoint pieces that were added in the kBinary section.
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_matrix_ops.mlir
135	This test fails with an assertion error: `Assertion failed: (!add || latGT(p0, p1)), function optimizeSet, file Merger.cpp, line 167.`
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_vector_ops.mlir
95	This test passes.
167	This test passes.
187	This test fails with an assertion error: `Assertion failed: (!add || latGT(p0, p1)), function optimizeSet, file Merger.cpp, line 167.`

Harbormaster completed remote builds in B157780: Diff 420248.Apr 4 2022, 11:44 AM

jim22k retitled this revision from Lowering for unary and binary to [mlir][sparse] WIP -- Lowering for unary and binary.Apr 4 2022, 11:58 AM

jim22k edited the summary of this revision.

Herald added a subscriber: limo1996. Apr 4 2022, 11:58 AM

aartbik added inline comments.Apr 13 2022, 11:44 AM

mlir/include/mlir/Dialect/SparseTensor/Utils/Merger.h
50	At first reading, I expected the kUnaryRegion and kBinaryRegion, but was a bit surprised with the kUnary and kBinary (since that generalizes a lot of the ops already there). Can you briefly document what the intended differences are here? Scanning ahead, it seems like one is eventually broken up into several of the others, but I wonder if we can't just keep a single kUnaryRegion and kBinaryRegion, and let the op itself drive the codegen?
98	style: period at end in comment. Also, originally I avoided linking this TensorExpr back to IR of code in MLIR, since during lattice construction, one must be careful to maintain the right 1:1 correspondence between the lattice point and the IR. Obviously, we have little choice for the new binary/unary regions, but let's document that here carefully.
158	ah, I see you solved the issue I alluded to above by passing this explicitly to the merge operations; let's brainstorm later if we can somehow do this using the child nodes only...
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
797–800	This is very surprising? Is this right?
1223	unnecessary format change?
1370	unnecessary format change?
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
25	this is not completely in style with the general setup: only assign the fields that are known to be set, assert on nullptr in all others
60–71	I think kUnary/BinaryRegion should be new cases that set op, all others assert !op
529	note that this could be a simple cast, not a dyn_cast, but more preferable, it looks like we could simply keep a single kUnaryRegion case, see below
545	This is indeed a bit trickier. For the time being, to focus on the overall logic, let's simply put this case under the same as all the other zero preserving unary ops, and assert that absent() is not set. Then we will enhance the logic later.
614	If you rewrite it as suggested below, I think the no-op comes natural.
631	same, should be a cast here, but it looks like we could simply keep a single kBinaryRegion case by looking at the result of this cast (which then needs to be dynamic again ;-) and decide what to do. I think I prefer that a bit more than artificially introducing the kBinary/kUnary as "handled" cases.
638	it feels this whole block of code, L612 to L639 is really takeDisj with some smart selection of the branches. Perhaps you can split out the analysis of the MLIR IR (getting the three branches), and then write a new takeDisj(... , opboth, opleft, opright) and put the takeDisj close to the other, just so that the actual lattice logic is not so deeply buried inside this huge block
751	At first reading, it was surprising to just see "UnaryOp" here, since that looks like just another arith:: version. Prefixing it with namespace sparse_tensor:: is of course against styleguide, but would look more clear. Perhaps we should rename the Unary/BinaryOp of sparse tensor dialect a bit more specific (can be done later).
850–851	what does it mean if we hit this part? an empty block? or an error?
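
The suggestion at line 638 above (a takeDisj that receives separate "both", "left", and "right" operations) can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyPoint`, `ToySet`, `toyConj`, and `toyDisj` are hypothetical names, lattice points are reduced to a bitmask plus a label, and nothing here is the actual Merger API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy stand-in for a lattice point: a bitmask of the tensors that must be
// present, plus a label naming the code to run when that condition holds.
struct ToyPoint {
  uint32_t bits;
  std::string action;
};

using ToySet = std::vector<ToyPoint>;

// Conjunction: every pairing of a point from s0 with a point from s1.
ToySet toyConj(const ToySet &s0, const ToySet &s1, const std::string &both) {
  ToySet out;
  for (const ToyPoint &p0 : s0)
    for (const ToyPoint &p1 : s1)
      out.push_back({p0.bits | p1.bits, both});
  return out;
}

// Disjunction with per-branch actions. An empty action string models an
// empty region, meaning that branch contributes no lattice points.
ToySet toyDisj(const ToySet &s0, const ToySet &s1, const std::string &both,
               const std::string &left, const std::string &right) {
  ToySet out = toyConj(s0, s1, both);
  if (!left.empty())
    for (const ToyPoint &p : s0)
      out.push_back({p.bits, left});
  if (!right.empty())
    for (const ToyPoint &p : s1)
      out.push_back({p.bits, right});
  return out;
}
```

In this model the conjunction is always emitted first, and the left-only and right-only points are appended only when their branch has something to do, which keeps the lattice logic in one small function separate from any analysis of the IR.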

jim22k added inline comments.Apr 13 2022, 5:13 PM

mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
797–800	This is related to my handling of binary[overlap] and unary[present] when those regions are empty. If you look at Merger.cpp:617, it passes a nullptr for the `Operation*`. That, in turn, manifests in line 886 and returns `Value()` for the Value output. When that empty Value reaches this code, we skip the insertion. It works, but it does feel like a hack. Can you think of a better way to indicate to the generating code that no output is needed?
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
631	That seems reasonable. I will refactor and check the result of dyn_cast. If it fails, it means the BinaryOp has already been handled. `buildLattices` has unsigned return type, so what should I return in that case? This is where I want a no-op, meaning "don't add any new lattice points".
638	Good idea to move this up near the other takeDisj().
850–851	This means an empty block, so I want to indicate no value.
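
The "empty region produces no value, so the insertion is skipped" behavior discussed for lines 797–800 and 850–851 can be sketched in isolation. This is a minimal illustration, assuming a hypothetical `ToyValue` that plays the role of a possibly-null `Value`; `ToyRegion`, `toyBuildValue`, and `toyInsert` are made-up names and do not correspond to functions in Sparsification.cpp or Merger.cpp.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for a value handle that may be null.
struct ToyValue {
  ToyValue() : valid(false), payload(0) {}
  explicit ToyValue(int p) : valid(true), payload(p) {}
  explicit operator bool() const { return valid; }
  bool valid;
  int payload;
};

// Hypothetical stand-in for a region: a callable, or empty when the region
// has no body.
using ToyRegion = std::function<int(int)>;

// Returns a null ToyValue when the region is empty.
ToyValue toyBuildValue(const ToyRegion &region, int input) {
  if (!region)
    return ToyValue();
  return ToyValue(region(input));
}

// Appends a result only when one was produced; a null value skips the
// insertion altogether.
void toyInsert(std::vector<int> &out, const ToyValue &rhs) {
  if (!rhs)
    return;
  out.push_back(rhs.payload);
}
```

The early return in `toyInsert` is the spot where an assert on the expected kinds could be placed if the null case is only legal for certain expressions.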

aartbik added inline comments.Apr 14 2022, 9:19 AM

mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
797–800	It was unexpected, but am okay with this if (1) it is documented and (2) perhaps we can even add some form of assert when !rhs that verifies that indeed we hit an expected case
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
631	If at all possible, I would like the "nop" to be detected at a higher level, i.e. before building the new lattices. Otherwise we would have to somehow return the existing set id (returning a nop value is a bit too intrusive to my taste). But after you have done the restructuring suggested here, I will have another look.

Updates based on feedback

Create 2nd takeDisj method specifically for binary
Eliminate kBinaryRegion/kUnaryRegion in favor of no "handled" cases
Add better comments
Eliminate the double handling of kBinary by looking at the loop level

jim22k added inline comments.Apr 16 2022, 8:01 PM

mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
1223	clang-format made this change; not sure why
1370	same as above -- clang-format wanted this changed
1603	This is new and is my attempt to avoid calling kBinary twice so we avoid the need for a "no-op" return. For matrices, there are two loops, so this piece of code is called twice. If we split apart the binary operation, we run into the need for a no-op. By passing `topSort.size() - at`, we only split apart the binary in the lowest loop. Previous calls will be a simple takeConj with the binary operation passed unchanged.
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
64	This is a weaker assert, but is required because splitting up the binary results in some pieces which are fundamentally unary, so they don't have a `y`. In this iteration, I am still labeling all the split up chunks of binary as kBinary.
628	I don't know if this is the correct approach, but it makes the matrix tests all pass. We only split up the binary operation once and we never need to return a no-op. At least, that will hold as long as z==1 for the last time we try to buildLattices() on the binary operation.
656	I created the new takeDisj, which is very nice. It does mean that all of the split up pieces of kBinary remain labeled as kBinary, even for `left` and `right` which only have a single input argument.
881	These can be combined in the switch because their logic is identical. The binary operation splits up into pieces which may have 1 or 2 input arguments, so they look just like the unary operation being split up.
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_matrix_ops.mlir
135	This is now passing.

Harbormaster completed remote builds in B159958: Diff 423272.Apr 16 2022, 10:25 PM

Make the absent region of sparse_tensor.unary work

jim22k added inline comments.Apr 18 2022, 1:11 PM

mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
545	I figured out how to make the `absent` region work. I essentially treat it like `v + 1` (which is a binary function). Using the same logic, I perform a disjunction. The overlap will be replaced by the `present` block which only has 1 input argument, so it will sub in the left argument rather than being truly binary. The rhs is a fixed value, so I create a kInvariant for it and it covers everything which is not a conjunction.
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_vector_ops.mlir
187	This test is now passing.
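
The idea described at line 545 above (treat the `absent` region like the constant in `v + 1`, so stored entries take the `present` branch and every other index takes the invariant) can be shown at the level of values rather than lattices. This is a sketch only: `toyUnary` is a hypothetical helper, the sparse vector is modeled as a `std::map` from index to stored value, and the generated code in the actual pass looks nothing like this loop.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <vector>

// Evaluates a toy unary op over a sparse vector of logical length n.
// Stored entries take the "present" branch; every other index takes the
// "absent" branch, which behaves like an invariant defined everywhere.
std::vector<int> toyUnary(const std::map<int, int> &stored, int n,
                          const std::function<int(int)> &present,
                          int absentVal) {
  std::vector<int> out(n);
  for (int i = 0; i < n; ++i) {
    auto it = stored.find(i);
    if (it != stored.end())
      out[i] = present(it->second); // input and invariant both exist
    else
      out[i] = absentVal; // invariant only: the input entry is missing
  }
  return out;
}
```

The two arms of the `if` correspond to the two pieces of the disjunction in the comment: the overlap, where only the left argument is actually used, and the remainder covered by the fixed value.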

Harbormaster completed remote builds in B160103: Diff 423458.Apr 18 2022, 1:40 PM

Merge latest from main

Harbormaster completed remote builds in B160229: Diff 423617.Apr 19 2022, 7:42 AM

Add test involving linalg.index

Harbormaster completed remote builds in B160302: Diff 423710.Apr 19 2022, 1:09 PM

@aartbik I think this PR is ready for a full review. It's passing all the tests.

I do need you to look at the if (z != 1) logic and see if that is reasonable. It works for the tests, but I would appreciate your perspective on where that might break.
I also want to discuss the Kind that is assigned when unary and binary are broken up. I currently have the Kind staying unchanged, but that limits what can be asserted in TensorExp's constructor. It also makes other checks strange -- for example, mapSet checks that we only pass in unary operations, but I also have to allow kBinary because some of the split up chunks require mapSet.
You said you wanted to brainstorm how to pass the blocks around via the children rather than the merge operations directly. I'm open to that if we can figure out a way. Otherwise, I don't think the current implementation is bad.

Thanks for making such diligent progress with this and apologies for the delay. A few high prio tasks came up but from here on it should be a bit smoother sailing!

mlir/include/mlir/Dialect/SparseTensor/Utils/Merger.h
98	Note that technically this field is only used for unary and binary, but putting this in a union or so seems too convoluted. But just add a comment on when this field is set.
158	I am okay with this for now (and perhaps permanently). It at least makes semantics at top level explicit.
247	"e" and "i" have some designated meaning, so please don't change into im. Here "z" can use a comment in the method description. But I am not sure about this approach, see below.
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
797–800	It is a bit strange to refer to "Kinds" here, just rephrase the comment in general terms. Anything we can assert on rhs or t if not set?
1603	This feels hacky. In the original iteration space theory, building the lattice sets solely depends on the current index, and it has nothing to do with the nesting depth or anything like that. In principle, we need to make this call at every loop nest to determine the remaining expressions.
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
62	assert on op at least
64	This feels hacky. I found these asserts very useful to ensure we are truly doing the right thing. If we introduce all sorts of intermediate stages, as in binary but marked unary etc, such important invariants are lost. Can we cleanup the "top level" logic that transforms binary into unary (this is e.g. also done around binary minus into unary minus)
67	&& op
151	period at end, here and below.
640	period at end
751	braces not needed
795	braces not needed
885	period at end
892	period at end
895	here and below, document these shortcut returns
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_matrix_ops.mlir
56	Period at end
57	Rather than adding and modifying an existing test, I would much rather that you add two new integration tests: unary and binary. Yes, it has some boiler plate to setup inputs, but that way it is more clear where the semi-ring stuff is tested.
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_vector_ops.mlir
93	Same here. I would move these into the semi ring integration tests mentioned above, and simply add the novec/vec flags at the top (a test can have more than one RUN/CHECK series)

aartbik retitled this revision from [mlir][sparse] WIP -- Lowering for unary and binary to [mlir][sparse] Lowering for unary and binary.Apr 20 2022, 1:30 PM

jim22k added inline comments.Apr 21 2022, 1:57 PM

mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
1603	I agree. This was a quick and dirty approach, but I've convinced myself that it is not just hacky, but actually gives wrong results. I figured out another way which is giving better results and avoids the need for this hack. I will clean up the code and show it off in the next revision.

aartbik added inline comments.Apr 21 2022, 2:42 PM

mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
1603	Looking forward to that. Thanks!

Change how repeat calls to unary and binary are handled

Remove z parameter from buildLattices
Add mergeOp to TensorExp
Move tests to their own files

Some things to notice with these changes:

There are now two Operation * in TensorExp. This lets us keep the original binary or unary operation as well as the yield operation to be merged after splitting it up into pieces. By keeping the original, it allows us to perform the same logic in further nested loops. In some of the split up pieces (disjoint left or right regions), we do not include the original op. This essentially signals that these pieces have been "handled". Only one region will contain the original and will continue building in the next level down.
There is one case where the Kind changes. kBinary keeps the conjoint piece as kBinary, but the disjoint pieces become kUnary. This lets us retain the check in mapSet that requires a unary kind. If you don't like this change of kind, I can change it. It only affects some of the asserts.

Harbormaster completed remote builds in B160921: Diff 424560.Apr 22 2022, 12:48 PM

aartbik added inline comments.Apr 25 2022, 4:00 PM

mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td
447 ↗	(On Diff #424560)	Can you actually add a test of this, something like %0 = linalg.generic #trait_vec_op ins(%arga, %argb: tensor<?xi64, #SparseVector>, tensor<?xi64, #SparseVector>) outs(%xv: tensor<?xi64, #SparseVector>) { ^bb(%a: i64, %b: i64, %x: i64): %idx = linalg.index 0 : index %cst = arith.index_cast %idx : index to i64 %1 = sparse_tensor.binary %a, %b : i64, i64 to i64 overlap={ ^bb0(%a0: i64, %b0: i64): sparse_tensor.yield %cst : i64 } left={} right={} linalg.yield %1 : i64 } -> tensor<?xi64, #SparseVector> and then see if we indeed generate the right code? I am not 100% sure how the part outside the new op will interact with codegen
mlir/include/mlir/Dialect/SparseTensor/Utils/Merger.h
98	Please add a bit more description for both, i.e. origOp points to original unary/binary op, mergeOp points to ... before going in detail of what needs to be there
100	This is a better solution than the original nesting depth based one. Still, the danger of a struct is of course that it is easy to add members :-). And although I doubt this is on any memory profile yet, still a bit worrisome to grow the sizeof for all nodes now. So if we keep this, at least add a TODO on improving memory usage in the future. Furthermore, I am wondering if it perhaps would be better to reflect the state in the type (sort of what you had originally) and just store one op. All of this can be done later, just thinking out loud here.
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
797–800	did you see these comments?
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
143	For this first version, I actually prefer that we keep takeDisj intact (other than passing op) and add a new takeCombi method that implements your new logic, just so we don't need to touch other ops in this revision. We can perhaps merge these methods later into one, but now too much is changing at once.
527	splinter?
530	I patched in your revision, but got compilation errors here. I think you need to include #include "mlir/Dialect/SparseTensor/IR/SparseTensor.h"
541	period at end
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_binary.mlir
42 ↗	(On Diff #424560)	Here, and probably for the unary too, can we add a case where one of the operands is dense (either truly dense, or annotated with "dense"). Just to verify that a lattice with "universal" index is properly tested.
208 ↗	(On Diff #424560)	looks like these two are never released (thus will fail memsan)
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_unary.mlir
40 ↗	(On Diff #424560)	You have a test with absent, and one with present. How about also testing when both are set, just for completeness? Like your %result = sparse_tensor.unary %a : f64 to i32 present={ ^bb0(%x: f64): %ret = arith.constant 1 : i32 sparse_tensor.yield %ret : i32 } absent={ %ret = arith.constant -1 : i32 sparse_tensor.yield %ret : i32 } example? In fact, even though you test the present case with a matrix, how about also doing that for a vector, just so that all cases are covered for a single loop example. That way, the code also acts like a nice illustration of the feature
112 ↗	(On Diff #424560)	indentation is off
119 ↗	(On Diff #424560)	do we want to do the same thing here? i.e. print values first (to see that we only compute sparse values) and then print full matrix for structure?
140 ↗	(On Diff #424560)	This is never released (and will thus fail our memsan test).

Change back to one Operation *

Add new kBinaryHalf kind
kUnary and kBinary hold their original operation until buildExp
During buildExp, locate the primary YieldOp and merge that
Add more tests

Harbormaster completed remote builds in B161461: Diff 425306.Apr 26 2022, 3:34 PM

jim22k added inline comments.Apr 27 2022, 7:28 AM

mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td
447 ↗	(On Diff #424560)	This will not work. I get this error: /Users/jkitchen/Projects/HIVE/llvm-project/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_binary.mlir:118:16: error: 'arith.index_cast' op operation destroyed but still has uses %cst = arith.index_cast %idx : index to i32 ^ /Users/jkitchen/Projects/HIVE/llvm-project/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_binary.mlir:118:16: note: see current operation: %0 = "arith.index_cast"(<<NULL VALUE>>) : (<<NULL TYPE>>) -> i32 /Users/jkitchen/Projects/HIVE/llvm-project/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_binary.mlir:113:10: note: - use: "sparse_tensor.lex_insert"(%2, %21, <<UNKNOWN SSA VALUE>>) : (tensor<?xi32, #sparse_tensor.encoding<{ dimLevelType = [ "compressed" ], pointerBitWidth = 0, indexBitWidth = 0 }>>, memref<?xindex>, i32) -> () %0 = linalg.generic #trait_vec_op ^ LLVM ERROR: operation destroyed but still has uses The only way I have been able to use `linalg.index` is to pass its result into `sparse_tensor.binary` as one of the arguments, as seen on line 449. That seems to work fine.
mlir/include/mlir/Dialect/SparseTensor/Utils/Merger.h
100	Alright, I settled on another new approach. We can make this work with only a single `Operation *` in the struct. kUnary: Operation * is the `sparse_tensor.unary`; kBinary: Operation * is the `sparse_tensor.binary`; kUnaryHandled: Operation * is the YieldOp. When buildLattices encounters kUnary or kBinary, it will build the lattice structure with potentially many regions (i.e. binary always calls takeDisj). The regions which can be considered fully handled will change to kUnaryHandled. The primary region of both unary and binary will remain as kUnary or kBinary and will hold on to the original operation (not the yield). This will let us continue to use the original at every level of the lattice. When we finally call buildExp, we can dig into the original unary and binary operations and pull out the primary region's YieldOp at that point. This approach also avoids confusion because kUnary and kBinary always refer to the correct sparse_tensor.unary and sparse_tensor.binary operations. Any disjoint offshoots use a different "Kind" to clarify that they are not the original operation anymore. I will update the PR with these changes.
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
797–800	I am adding a check to ensure that the kind is either kUnary or kBinary.
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
530	That should already be there. See line 11.
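
The single-`Operation *` scheme described at line 100 above (the primary piece keeps the original operation, while offshoots switch to a "handled" kind that holds only a yield) can be sketched with toy types. Everything below is an assumption made for illustration: `ToyBinaryOp`, `ToyKind`, `ToyPiece`, `toySplit`, and `toyYieldOf` are hypothetical, and regions are reduced to the name of their yield.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for sparse_tensor.binary: each region is reduced to
// the name of its yield, and an empty name models an empty region.
struct ToyBinaryOp {
  std::string overlapYield;
  std::string leftYield;
  std::string rightYield;
};

enum class ToyKind { kBinary, kHandled };

// One piece produced by splitting. The primary piece keeps the original
// operation; handled pieces only remember the yield they should emit.
struct ToyPiece {
  ToyKind kind;
  const ToyBinaryOp *orig; // set only for ToyKind::kBinary
  std::string yield;       // set only for ToyKind::kHandled
};

// Splits the op into a primary piece plus one handled piece per non-empty
// disjoint region.
std::vector<ToyPiece> toySplit(const ToyBinaryOp &op) {
  std::vector<ToyPiece> pieces;
  pieces.push_back({ToyKind::kBinary, &op, ""});
  if (!op.leftYield.empty())
    pieces.push_back({ToyKind::kHandled, nullptr, op.leftYield});
  if (!op.rightYield.empty())
    pieces.push_back({ToyKind::kHandled, nullptr, op.rightYield});
  return pieces;
}

// Resolved late: the primary piece looks up its yield in the original op.
std::string toyYieldOf(const ToyPiece &piece) {
  return piece.kind == ToyKind::kBinary ? piece.orig->overlapYield
                                        : piece.yield;
}
```

Because the primary piece still points at the original operation, the same split can be repeated at a deeper loop level, while the handled pieces carry a distinct kind and are never split again.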

Sorry my comments are out of order with the new PR. They should be read as if they happened before the new PR. I forgot to submit them until now.

A few last comments. Sorry for being nitpicky on this, rest assured I really like the work, I am just a bit particular on certain things.
Also, can you please make a pass over all my comments and mark them "Done" (or comment on them otherwise).
That makes the re-review a bit easier, since a lot of my past comments still show up as unresolved.

mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td
440 ↗	(On Diff #425306)	See below, but in a nutshell, let's not rewrite the example just because the desired form does not work yet.
447 ↗	(On Diff #424560)	But by passing it as argument, we lose the intersection/union power of the binary op. Don't you agree that the code above should work, or the code example you had originally? (okay doing this later, but I would prefer that we show a binary op example in the doc that uses two sparse inputs and uses the index in one or more of the branches, and not rewrite the example to match the current implementation status) The problem is that we probably need some work dealing with the "outside" block computations that are only used in some places (and thus can be placed under the conditional). We can put a TODO if you prefer.
mlir/include/mlir/Dialect/SparseTensor/Utils/Merger.h
49	Maybe add a comment, as in // semi-ring unary op
66	maybe add same-line comment as in // semi-ring binary op
100	Looks like you iterated over some naming, since I see kBinaryHalf now. Since we are bikeshedding, how about kBinaryBranch to show we have one case of a binary now?
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
802	if the if-part has braces, so does the else-part, even when it is a single line stmt
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
24	: kind(k), val(v), op(operation) { and then nothing below, since these are not part of a union so can always be set. Also, it feels weird to have a longer parameter name than field name, so perhaps call the parameter simply "o" (which is in style with the current conventions)
62	can we assert on y in the new model?
136	does this line apply to L144? If so, please place it after that line
342	please follow same order as in enum decl.
525	just use cast (not dyn_cast) and no assert, since this should always work
536	same here and below, if the cast should work, just cast
811	cast, we assume that verifier has filtered out bad cases
882	The unary and binary codegen blocks are a bit too large for this context (most others are oneliners) so please move into own method
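
The constructor comments above (lines 24 and 62, together with the earlier "assert !op in all other cases" remark) can be put together in a small sketch. It is only an illustration under assumptions: `ToyKind` is a trimmed-down, hypothetical enum, `ToyOperation` stands in for `mlir::Operation`, and the actual `TensorExp` has more fields and many more cases.

```cpp
#include <cassert>

// Hypothetical, trimmed-down kinds; the real enum has many more members.
enum class ToyKind { kTensor, kAddI, kUnary, kBinary };

struct ToyOperation {}; // stand-in for mlir::Operation

struct ToyChildren {
  unsigned e0;
  unsigned e1;
};

struct ToyTensorExp {
  // Fields outside the union are always assigned in the initializer list;
  // the switch only fills the union member and checks invariants.
  ToyTensorExp(ToyKind k, unsigned x, unsigned y, ToyOperation *o)
      : kind(k), op(o) {
    switch (kind) {
    case ToyKind::kTensor:
      assert(!op && "only semi-ring kinds carry an operation");
      tensor = x;
      break;
    case ToyKind::kAddI:
      assert(!op && "only semi-ring kinds carry an operation");
      children.e0 = x;
      children.e1 = y;
      break;
    case ToyKind::kUnary:
    case ToyKind::kBinary:
      // Nothing is asserted about y here: per the discussion, it may or may
      // not be set depending on how the expression was split.
      assert(op && "semi-ring kinds must carry their operation");
      children.e0 = x;
      children.e1 = y;
      break;
    }
  }

  ToyKind kind;
  union {
    unsigned tensor;
    ToyChildren children;
  };
  ToyOperation *op;
};
```

Keeping the asserts per case preserves the invariant checks the reviewer found useful, while the initializer list avoids repeating the `op` assignment in every branch.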

jim22k marked 63 inline comments as done.Apr 28 2022, 9:16 AM

jim22k added inline comments.

mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td
447 ↗	(On Diff #424560)	I would hope that your example test (and the documentation example I originally had) would work. If you are okay with a documentation example that gives an error, but is the desired end goal, I will revert to that. I am just sensitive to introducing a new feature with broken examples, giving people who want to try it out a bad impression of the work, unless you are going to figure out how to make it work before we merge.
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
62	We cannot. During the original call, y==-1. When `absentRegion` is empty, we call `mapSet` and y remains equal to -1. But if `absentRegion` is not empty, we call `takeDisj` and both x and y are set. For this "binary" case handling the absent region, when buildExp is called, we only utilize v0 and completely ignore v1, even though it exists from a lattice perspective. Would it be helpful to add a comment here about why we can't assert anything about `y`?

Updates based on feedback

Change kBinaryHalf to kBinaryBranch
Revert binaryop doc example
Remove dynamic casting

aartbik added inline comments.Apr 28 2022, 9:56 AM

mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td
447 ↗	(On Diff #424560)	I am okay with that, but please add a small note on that in the doc (this is intended future behavior but still under construction ;-)
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
62	Yeah, please document, so that in the future I do not try to put the assert back ;-)

Harbormaster completed remote builds in B161824: Diff 425824.Apr 28 2022, 10:21 AM

Okay, adding more comments.

Few more comments

Harbormaster completed remote builds in B161858: Diff 425870.Apr 28 2022, 1:50 PM

A few last comments, but I am giving you the LGTM, since I think this is ready to go in, so we can work on refining the few remaining issues.
Thanks for your patience during the review, and thanks for this contribution!

mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td
447 ↗	(On Diff #424560)	I don't see the note in the doc (I really meant, the user facing example, so that people trying this out are aware it does not work yet). So something like Example of A+B in upper triangle, A-B in lower triangle (not working yet, but construct will be available soon).
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
804	mark these new helpers as "static" since they are private to the file

This revision is now accepted and ready to land.May 2 2022, 5:20 PM

Make helper functions static

@aartbik Thanks for reviewing and helping me converge on a good solution for lowering.
Once the build passes, I will merge it.

Harbormaster completed remote builds in B162474: Diff 426728.May 3 2022, 9:57 AM

Closed by commit rG2c3326608460: [mlir][sparse] Add lowering for unary and binary ops (authored by jim22k). May 3 2022, 1:51 PM

This revision was automatically updated to reflect the committed changes.

jim22k added a commit: rG2c3326608460: [mlir][sparse] Add lowering for unary and binary ops.

Revision Contents

Path	Size
mlir/include/mlir/Dialect/SparseTensor/Utils/Merger.h	36 lines
mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp	14 lines
mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp	216 lines
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_matrix_ops.mlir	76 lines
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_vector_ops.mlir	129 lines
mlir/unittests/Dialect/SparseTensor/MergerTest.cpp	4 lines

Diff 423617

mlir/include/mlir/Dialect/SparseTensor/Utils/Merger.h

Show All 40 Lines	enum Kind {
kCastFU, // unsigned		kCastFU, // unsigned
kCastSF, // signed		kCastSF, // signed
kCastUF, // unsigned		kCastUF, // unsigned
kCastS, // signed		kCastS, // signed
kCastU, // unsigned		kCastU, // unsigned
kCastIdx,		kCastIdx,
kTruncI,		kTruncI,
kBitCast,		kBitCast,
		kUnary,
// Binary operations.		// Binary operations.
kMulF,		kMulF,
kMulI,		kMulI,
kDivF,		kDivF,
kDivS, // signed		kDivS, // signed
kDivU, // unsigned		kDivU, // unsigned
kAddF,		kAddF,
kAddI,		kAddI,
kSubF,		kSubF,
kSubI,		kSubI,
kAndI,		kAndI,
kOrI,		kOrI,
kXorI,		kXorI,
kShrS, // signed		kShrS, // signed
kShrU, // unsigned		kShrU, // unsigned
kShlI,		kShlI,
		kBinary,
};		};

/// Children subexpressions of tensor operations.		/// Children subexpressions of tensor operations.
struct Children {		struct Children {
unsigned e0;		unsigned e0;
unsigned e1;		unsigned e1;
};		};

/// Tensor expression. Represents a MLIR expression in tensor index notation.		/// Tensor expression. Represents a MLIR expression in tensor index notation.
struct TensorExp {		struct TensorExp {
TensorExp(Kind k, unsigned x, unsigned y, Value v);		TensorExp(Kind k, unsigned x, unsigned y, Value v, Operation *op);

/// Tensor expression kind.		/// Tensor expression kind.
Kind kind;		Kind kind;

union {		union {
/// Expressions representing tensors simply have a tensor number.		/// Expressions representing tensors simply have a tensor number.
unsigned tensor;		unsigned tensor;

/// Indices hold the index number.		/// Indices hold the index number.
unsigned index;		unsigned index;

/// Tensor operations hold the indices of their children.		/// Tensor operations hold the indices of their children.
Children children;		Children children;
};		};

/// Direct link to IR for an invariant or the destination value (to		/// Direct link to IR for an invariant or the destination value (to
/// infer destination type) of a cast operation During code generation,		/// infer destination type) of a cast operation During code generation,
/// this field may be used to cache "hoisted" loop invariant tensor loads.		/// this field may be used to cache "hoisted" loop invariant tensor loads.
Value val;		Value val;

		/// Holder for block of custom code to be merged.
		Operation *operation;
};		};
		aartbik: This is a better solution than the original nesting depth based one. Still, the danger of a struct is of course that it is easy to add members :-). And although I doubt this is on any memory profile yet, it is still a bit worrisome to grow the sizeof for all nodes now. So if we keep this, at least add a TODO on improving memory usage in the future. Furthermore, I am wondering if it perhaps would be better to reflect the state in the type (sort of what you had originally) and just store one op. All of this can be done later, just thinking out loud here.
		jim22k: Alright, I settled on another new approach. We can make this work with only a single `Operation *` in the struct:
		- kUnary: Operation * is the `sparse_tensor.unary`
		- kBinary: Operation * is the `sparse_tensor.binary`
		- kUnaryHandled: Operation * is the YieldOp
		When buildLattices encounters kUnary or kBinary, it will build the lattice structure with potentially many regions (i.e. binary always calls takeDisj). The regions which can be considered fully handled will change to kUnaryHandled. The primary region of both unary and binary will remain as kUnary or kBinary and will hold on to the original operation (not the yield). This will let us continue to use the original at every level of the lattice. When we finally call buildExp, we can dig into the original unary and binary operations and pull out the primary region's YieldOp at that point. This approach also avoids confusion because kUnary and kBinary always refer to the correct sparse_tensor.unary and sparse_tensor.binary operations. Any disjoint offshoots use a different "Kind" to clarify that they are not the original operation anymore. I will update the PR with these changes.
		aartbik: Looks like you iterated over some naming, since I see kBinaryHalf now. Since we are bikeshedding, how about kBinaryBranch to show we have one case of a binary now?
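The single-pointer scheme described in this thread can be sketched roughly as follows. This is only an illustration under stated assumptions: `Operation` is a stand-in struct rather than `mlir::Operation`, `primaryYield` and `yieldFor` are hypothetical names, and `kBinaryBranch` is used for the "handled offshoot" kind simply because it is the name proposed last in the thread.

```cpp
#include <cassert>

// Stand-in for mlir::Operation so the sketch compiles without MLIR.
// `primaryYield` is a hypothetical link to the yield of the primary region.
struct Operation {
  Operation *primaryYield;
};

// Only the kinds relevant to this thread; kBinaryBranch marks an offshoot
// that no longer refers to the original unary/binary operation.
enum Kind { kUnary, kBinary, kBinaryBranch };

struct ExpSketch {
  Kind kind;
  Operation *op; // Original op for kUnary/kBinary, the yield for offshoots.
};

// Returns the yield a buildExp-like step would read for this node.
Operation *yieldFor(const ExpSketch &e) {
  assert(e.op && "region-carrying nodes must reference an operation");
  switch (e.kind) {
  case kUnary:
  case kBinary:
    // Primary region: dig into the original operation at build time.
    return e.op->primaryYield;
  case kBinaryBranch:
    // Offshoot: the node already stores the yield itself.
    return e.op;
  }
  return nullptr;
}
```

The point of the design is that nodes of kind kUnary or kBinary keep pointing at the original operation at every lattice level, while offshoots are distinguishable by kind alone.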

/// Lattice point. Each lattice point consists of a conjunction of tensor		/// Lattice point. Each lattice point consists of a conjunction of tensor
/// loop indices (encoded in a bitvector) and the index of the corresponding		/// loop indices (encoded in a bitvector) and the index of the corresponding
/// tensor expression.		/// tensor expression.
struct LatPoint {		struct LatPoint {
LatPoint(unsigned n, unsigned e, unsigned b);		LatPoint(unsigned n, unsigned e, unsigned b);
LatPoint(const BitVector &b, unsigned e);		LatPoint(const BitVector &b, unsigned e);

/// Conjunction of tensor loop indices as bitvector. This represents		/// Conjunction of tensor loop indices as bitvector. This represents
/// all indices involved in the tensor expression		/// all indices involved in the tensor expression
BitVector bits;		BitVector bits;

/// Simplified conjunction of tensor loop indices as bitvector. This		/// Simplified conjunction of tensor loop indices as bitvector. This
/// represents a simplified condition under which this tensor expression		/// represents a simplified condition under which this tensor expression
/// must execute. Pre-computed during codegen to avoid repeated eval.		/// must execute. Pre-computed during codegen to avoid repeated eval.
BitVector simple;		BitVector simple;

/// Index of the tensor expresssion.		/// Index of the tensor expression.
unsigned exp;		unsigned exp;
};		};

/// A class to handle all iteration lattice operations. This class abstracts		/// A class to handle all iteration lattice operations. This class abstracts
/// away from some implementation details of storing iteration lattices and		/// away from some implementation details of storing iteration lattices and
/// tensor expressions. This allows for fine-tuning performance characteristics		/// tensor expressions. This allows for fine-tuning performance characteristics
/// independently from the basic algorithm if bottlenecks are identified.		/// independently from the basic algorithm if bottlenecks are identified.
class Merger {		class Merger {
public:		public:
/// Constructs a merger for the given number of tensors and loops. The		/// Constructs a merger for the given number of tensors and loops. The
/// user supplies the number of tensors involved in the kernel, with the		/// user supplies the number of tensors involved in the kernel, with the
/// last tensor in this set denoting the output tensor. The merger adds an		/// last tensor in this set denoting the output tensor. The merger adds an
/// additional synthetic tensor at the end of this set to represent all		/// additional synthetic tensor at the end of this set to represent all
/// invariant expressions in the kernel.		/// invariant expressions in the kernel.
Merger(unsigned t, unsigned l)		Merger(unsigned t, unsigned l)
: outTensor(t - 1), syntheticTensor(t), numTensors(t + 1), numLoops(l),		: outTensor(t - 1), syntheticTensor(t), numTensors(t + 1), numLoops(l),
hasSparseOut(false), dims(t + 1, std::vector<Dim>(l, Dim::kUndef)) {}		hasSparseOut(false), dims(t + 1, std::vector<Dim>(l, Dim::kUndef)) {}

/// Adds a tensor expression. Returns its index.		/// Adds a tensor expression. Returns its index.
unsigned addExp(Kind k, unsigned e0, unsigned e1 = -1u, Value v = Value());		unsigned addExp(Kind k, unsigned e0, unsigned e1 = -1u, Value v = Value(),
unsigned addExp(Kind k, unsigned e, Value v) { return addExp(k, e, -1u, v); }		Operation *op = nullptr);
unsigned addExp(Kind k, Value v) { return addExp(k, -1u, -1u, v); }		unsigned addExp(Kind k, unsigned e, Value v, Operation *op = nullptr) {
		return addExp(k, e, -1u, v, op);
		}
		unsigned addExp(Kind k, Value v, Operation *op = nullptr) {
		return addExp(k, -1u, -1u, v, op);
		}

/// Adds an iteration lattice point. Returns its index.		/// Adds an iteration lattice point. Returns its index.
unsigned addLat(unsigned t, unsigned i, unsigned e);		unsigned addLat(unsigned t, unsigned i, unsigned e);

/// Adds a new, initially empty, set. Returns its index.		/// Adds a new, initially empty, set. Returns its index.
unsigned addSet();		unsigned addSet();

/// Computes a single conjunction of two lattice points by taking the "union"		/// Computes a single conjunction of two lattice points by taking the "union"
/// of loop indices (effectively constructing a larger "intersection" of those		/// of loop indices (effectively constructing a larger "intersection" of those
/// indices) with a newly constructed tensor (sub)expression of given kind.		/// indices) with a newly constructed tensor (sub)expression of given kind.
/// Returns the index of the new lattice point.		/// Returns the index of the new lattice point.
unsigned conjLatPoint(Kind kind, unsigned p0, unsigned p1);		unsigned conjLatPoint(Kind kind, unsigned p0, unsigned p1,
		Operation *op = nullptr);
		aartbik: Ah, I see you solved the issue I alluded to above by passing this explicitly to the merge operations; let's brainstorm later if we can somehow do this using the child nodes only...
		aartbik: I am okay with this for now (and perhaps permanently). It at least makes semantics at top level explicit.
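As a reading aid for the `conjLatPoint` comment above, the "union of loop indices" step can be sketched on plain bit vectors. This is a minimal sketch, not the Merger code: `Bits` and `conjBits` are hypothetical names, and `std::vector<bool>` stands in for `llvm::BitVector`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for llvm::BitVector: one bit per (tensor, loop index) pair.
using Bits = std::vector<bool>;

// Conjunction of two lattice points: the new point requires every index
// that either operand requires, i.e. the bitwise OR of both bit sets.
Bits conjBits(const Bits &a, const Bits &b) {
  assert(a.size() == b.size() && "points must range over the same indices");
  Bits out(a);
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = out[i] || b[i];
  return out;
}
```

In the diff, the new `Operation *op` argument only travels alongside this step into the newly created expression; it does not change how the bits are combined.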

/// Conjunctive merge of two lattice sets L0 and L1 is conjunction of		/// Conjunctive merge of two lattice sets L0 and L1 is conjunction of
/// cartesian product. Returns the index of the new set.		/// cartesian product. Returns the index of the new set.
unsigned takeConj(Kind kind, unsigned s0, unsigned s1);		unsigned takeConj(Kind kind, unsigned s0, unsigned s1,
		Operation *op = nullptr);

/// Disjunctive merge of two lattice sets L0 and L1 is (L0 /\_op L1, L0, L1).		/// Disjunctive merge of two lattice sets L0 and L1 is (L0 /\_op L1, L0, L1).
/// Returns the index of the new set.		/// Returns the index of the new set.
unsigned takeDisj(Kind kind, unsigned s0, unsigned s1);		unsigned takeDisj(Kind kind, unsigned s0, unsigned s1, Operation *op);
		unsigned takeDisj(Kind kind, unsigned s0, unsigned s1, bool includeLeft,
		bool includeRight, Operation *opboth, Operation *opleft,
		Operation *opright);

/// Maps the unary operator over the lattice set of the operand, i.e. each		/// Maps the unary operator over the lattice set of the operand, i.e. each
/// lattice point on an expression E is simply copied over, but with OP E		/// lattice point on an expression E is simply copied over, but with OP E
/// as new expression. Returns the index of the new set.		/// as new expression. Returns the index of the new set.
unsigned mapSet(Kind kind, unsigned s0, Value v = Value());		unsigned mapSet(Kind kind, unsigned s0, Value v = Value(),
		Operation *op = nullptr);

/// Optimizes the iteration lattice points in the given set. This		/// Optimizes the iteration lattice points in the given set. This
/// method should be called right before code generation to avoid		/// method should be called right before code generation to avoid
/// generating redundant loops and conditions.		/// generating redundant loops and conditions.
unsigned optimizeSet(unsigned s0);		unsigned optimizeSet(unsigned s0);

/// Simplifies the conditions in a conjunction of a given lattice point		/// Simplifies the conditions in a conjunction of a given lattice point
/// within the given set using just two basic rules:		/// within the given set using just two basic rules:
[54 lines not shown]	#ifndef NDEBUG
void dumpLat(unsigned p) const;		void dumpLat(unsigned p) const;
void dumpSet(unsigned s) const;		void dumpSet(unsigned s) const;
void dumpBits(const BitVector &bits) const;		void dumpBits(const BitVector &bits) const;
#endif		#endif

/// Builds the iteration lattices in a bottom-up traversal given the remaining		/// Builds the iteration lattices in a bottom-up traversal given the remaining
/// tensor (sub)expression and the next loop index in the iteration graph.		/// tensor (sub)expression and the next loop index in the iteration graph.
/// Returns index of the root expression.		/// Returns index of the root expression.
unsigned buildLattices(unsigned e, unsigned i);		unsigned buildLattices(unsigned e, unsigned im, unsigned z);
		aartbik: "e" and "i" have some designated meaning, so please don't change into im. Here "z" can use a comment in the method description. But I am not sure about this approach, see below.

/// Builds a tensor expression from the given Linalg operation.		/// Builds a tensor expression from the given Linalg operation.
/// Returns index of the root expression on success.		/// Returns index of the root expression on success.
Optional<unsigned> buildTensorExpFromLinalg(linalg::GenericOp op);		Optional<unsigned> buildTensorExpFromLinalg(linalg::GenericOp op);

/// Rebuilds SSA format from a tensor expression.		/// Rebuilds SSA format from a tensor expression.
Value buildExp(PatternRewriter &rewriter, Location loc, unsigned e, Value v0,		Value buildExp(PatternRewriter &rewriter, Location loc, unsigned e, Value v0,
Value v1);		Value v1);
[26 lines not shown]

mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp

[788 lines not shown]	if (codegen.curVecLength > 1)
rhs = rewriter.create<arith::SelectOp>(loc, codegen.curVecMask, rhs,		rhs = rewriter.create<arith::SelectOp>(loc, codegen.curVecMask, rhs,
codegen.redVal);		codegen.redVal);
updateReduc(merger, codegen, rhs);		updateReduc(merger, codegen, rhs);
return;		return;
}		}
// Store during insertion.		// Store during insertion.
OpOperand *t = op.getOutputOperand(0);		OpOperand *t = op.getOutputOperand(0);
if (t == codegen.sparseOut) {		if (t == codegen.sparseOut) {
		// A few Kinds have conditional output (ex. sparse_tensor.unary) and
		// indicate no output by passing an uninitialized Value().
		if (rhs)
genInsertionStore(codegen, rewriter, op, t, rhs);		genInsertionStore(codegen, rewriter, op, t, rhs);
		aartbik: This is very surprising? Is this right?
		jim22k: This is related to my handling of binary[overlap] and unary[present] when those regions are empty. If you look at Merger.cpp:617, it passes a nullptr for the `Operation*`. That, in turn, manifests in line 886 and returns `Value()` for the Value output. When that empty Value reaches this code, we skip the insertion. It works, but it does feel like a hack. Can you think of a better way to indicate to the generating code that no output is needed?
		aartbik: It was unexpected, but I am okay with this if (1) it is documented and (2) perhaps we can even add some form of assert when !rhs that verifies that indeed we hit an expected case.
		aartbik: It is a bit strange to refer to "Kinds" here, just rephrase the comment in general terms. Anything we can assert on rhs or t if not set?
		aartbik: Did you see these comments?
		jim22k: I am adding a check to ensure that the kind is either kUnary or kBinary.
return;		return;
}		}
		aartbik: If the if-part has braces, so does the else-part, even when it is a single line stmt.
// Actual store.		// Actual store.
SmallVector<Value, 4> args;		SmallVector<Value, 4> args;
Value ptr = genSubscript(codegen, rewriter, op, t, args);		Value ptr = genSubscript(codegen, rewriter, op, t, args);
if (codegen.curVecLength > 1)		if (codegen.curVecLength > 1)
genVectorStore(codegen, rewriter, rhs, ptr, args);		genVectorStore(codegen, rewriter, rhs, ptr, args);
else		else
rewriter.create<memref::StoreOp>(loc, rhs, ptr, args);		rewriter.create<memref::StoreOp>(loc, rhs, ptr, args);
}		}
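The `if (rhs)` guard discussed in the insertion-store thread above, combined with the requested assert, might look roughly like this. It is a sketch under assumptions, not the Sparsification.cpp code: a null `const double *` stands in for an uninitialized `Value()`, a `std::vector<double>` stands in for the sparse output, and `storeIfPresent` is a hypothetical name.

```cpp
#include <cassert>
#include <vector>

// Small subset of expression kinds, enough for the sketch.
enum Kind { kAddF, kUnary, kBinary };

// Appends *rhs to `out` when a value is present and reports whether a store
// was emitted. A null rhs plays the role of an uninitialized Value().
bool storeIfPresent(Kind kind, const double *rhs, std::vector<double> &out) {
  if (!rhs) {
    // Only the custom unary/binary kinds may legitimately produce no output;
    // anything else reaching this point would indicate a bug.
    assert((kind == kUnary || kind == kBinary) &&
           "missing rhs is only expected for unary/binary regions");
    return false;
  }
  out.push_back(*rhs);
  return true;
}
```

The assert turns the "skip silently" behavior into a checked expectation, which is what the review asks for.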
[404 lines not shown]	static Operation *genFor(Merger &merger, CodeGen &codegen,
if (isVector)		if (isVector)
codegen.curVecMask = genVectorMask(codegen, rewriter, iv, lo, hi, step);		codegen.curVecMask = genVectorMask(codegen, rewriter, iv, lo, hi, step);
return forOp;		return forOp;
}		}

/// Emit a while-loop for co-iteration over multiple indices.		/// Emit a while-loop for co-iteration over multiple indices.
static Operation *genWhile(Merger &merger, CodeGen &codegen,		static Operation *genWhile(Merger &merger, CodeGen &codegen,
PatternRewriter &rewriter, linalg::GenericOp op,		PatternRewriter &rewriter, linalg::GenericOp op,
unsigned idx, bool needsUniv,		unsigned idx, bool needsUniv, BitVector &indices) {
		aartbik: Unnecessary format change?
		jim22k: clang-format made this change; not sure why.
BitVector &indices) {
SmallVector<Type, 4> types;		SmallVector<Type, 4> types;
SmallVector<Value, 4> operands;		SmallVector<Value, 4> operands;
// Construct the while-loop with a parameter for each index.		// Construct the while-loop with a parameter for each index.
Type indexType = rewriter.getIndexType();		Type indexType = rewriter.getIndexType();
for (unsigned b = 0, be = indices.size(); b < be; b++) {		for (unsigned b = 0, be = indices.size(); b < be; b++) {
if (indices[b] && merger.isDim(b, Dim::kSparse)) {		if (indices[b] && merger.isDim(b, Dim::kSparse)) {
unsigned tensor = merger.tensor(b);		unsigned tensor = merger.tensor(b);
assert(idx == merger.index(b));		assert(idx == merger.index(b));
[130 lines not shown]	rewriter.create<memref::StoreOp>(loc, codegen.loops[idx], codegen.lexIdx,
pos);		pos);
}		}
}		}

/// Generates the induction structure for a while-loop.		/// Generates the induction structure for a while-loop.
static void genWhileInduction(Merger &merger, CodeGen &codegen,		static void genWhileInduction(Merger &merger, CodeGen &codegen,
PatternRewriter &rewriter, linalg::GenericOp op,		PatternRewriter &rewriter, linalg::GenericOp op,
unsigned idx, bool needsUniv,		unsigned idx, bool needsUniv,
BitVector &induction,		BitVector &induction, scf::WhileOp whileOp) {
		aartbik: Unnecessary format change?
		jim22k: Same as above: clang-format wanted this changed.
scf::WhileOp whileOp) {
Location loc = op.getLoc();		Location loc = op.getLoc();
// Finalize each else branch of all if statements.		// Finalize each else branch of all if statements.
if (codegen.redVal || codegen.expValues) {		if (codegen.redVal || codegen.expValues) {
while (auto ifOp = dyn_cast_or_null<scf::IfOp>(		while (auto ifOp = dyn_cast_or_null<scf::IfOp>(
rewriter.getInsertionBlock()->getParentOp())) {		rewriter.getInsertionBlock()->getParentOp())) {
unsigned y = 0;		unsigned y = 0;
SmallVector<Value, 4> yields;		SmallVector<Value, 4> yields;
if (codegen.redVal) {		if (codegen.redVal) {
[215 lines not shown]	if (at == topSort.size()) {
Value rhs = genExp(merger, codegen, rewriter, op, exp, ldx);		Value rhs = genExp(merger, codegen, rewriter, op, exp, ldx);
genTensorStore(merger, codegen, rewriter, op, rhs);		genTensorStore(merger, codegen, rewriter, op, rhs);
return;		return;
}		}

// Construct iteration lattices for current loop index, with L0 at top.		// Construct iteration lattices for current loop index, with L0 at top.
unsigned idx = topSort[at];		unsigned idx = topSort[at];
unsigned ldx = at == 0 ? -1u : topSort[at - 1];		unsigned ldx = at == 0 ? -1u : topSort[at - 1];
unsigned lts = merger.optimizeSet(merger.buildLattices(exp, idx));		unsigned lts =
		merger.optimizeSet(merger.buildLattices(exp, idx, topSort.size() - at));
		jim22k: This is new and is my attempt to avoid calling kBinary twice so we avoid the need for a "no-op" return. For matrices, there are two loops, so this piece of code is called twice. If we split apart the binary operation, we run into the need for a no-op. By passing `topSort.size() - at`, we only split apart the binary in the lowest loop. Previous calls will be a simple takeConj with the binary operation passed unchanged.
		aartbik: This feels hacky. In the original iteration space theory, building the lattice sets solely depends on the current index, and it has nothing to do with the nesting depth or anything like that. In principle, we need to make this call at every loop nest to determine the remaining expressions.
		jim22k: I agree. This was a quick and dirty approach, but I've convinced myself that it is not just hacky, but actually gives wrong results. I figured out another way which is giving better results and avoids the need for this hack. I will clean up the code and show it off in the next revision.
		aartbik: Looking forward to that. Thanks!

// Start a loop sequence.		// Start a loop sequence.
bool needsUniv = startLoopSeq(merger, codegen, rewriter, op, topSort, exp, at,		bool needsUniv = startLoopSeq(merger, codegen, rewriter, op, topSort, exp, at,
idx, ldx, lts);		idx, ldx, lts);

// Emit a loop for every lattice point L0 >= Li in this loop sequence.		// Emit a loop for every lattice point L0 >= Li in this loop sequence.
unsigned lsize = merger.set(lts).size();		unsigned lsize = merger.set(lts).size();
for (unsigned i = 0; i < lsize; i++) {		for (unsigned i = 0; i < lsize; i++) {
[124 lines not shown]

mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp

//===- Merger.cpp - Implementation of iteration lattices ------------------===//		//===- Merger.cpp - Implementation of iteration lattices ------------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "mlir/Dialect/SparseTensor/Utils/Merger.h"		#include "mlir/Dialect/SparseTensor/Utils/Merger.h"
#include "mlir/Dialect/Arithmetic/IR/Arithmetic.h"		#include "mlir/Dialect/Arithmetic/IR/Arithmetic.h"
		#include "mlir/Dialect/SparseTensor/IR/SparseTensor.h"

#include "mlir/IR/Operation.h"		#include "mlir/IR/Operation.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"

namespace mlir {		namespace mlir {
namespace sparse_tensor {		namespace sparse_tensor {

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Constructors.		// Constructors.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

TensorExp::TensorExp(Kind k, unsigned x, unsigned y, Value v)		TensorExp::TensorExp(Kind k, unsigned x, unsigned y, Value v, Operation *op)
: kind(k), val(v) {		: kind(k), val(v) {
		aartbik: `: kind(k), val(v), op(operation) {` and then nothing below, since these are not part of a union and so can always be set. Also, it feels weird to have a longer parameter name than field name, so perhaps call the parameter simply "o" (which is in style with the current conventions).
switch (kind) {		switch (kind) {
		aartbik: This is not completely in style with the general setup: only assign the fields that are known to be set, assert on nullptr in all others.
case kTensor:		case kTensor:
assert(x != -1u && y == -1u && !v);		assert(x != -1u && y == -1u && !v && !op);
tensor = x;		tensor = x;
break;		break;
case kInvariant:		case kInvariant:
assert(x == -1u && y == -1u && v);		assert(x == -1u && y == -1u && v && !op);
break;		break;
case kIndex:		case kIndex:
assert(x != -1u && y == -1u && !v);		assert(x != -1u && y == -1u && !v && !op);
index = x;		index = x;
break;		break;
case kAbsF:		case kAbsF:
case kCeilF:		case kCeilF:
case kFloorF:		case kFloorF:
case kNegF:		case kNegF:
case kNegI:		case kNegI:
assert(x != -1u && y == -1u && !v);		assert(x != -1u && y == -1u && !v && !op);
children.e0 = x;		children.e0 = x;
children.e1 = y;		children.e1 = y;
break;		break;
case kTruncF:		case kTruncF:
case kExtF:		case kExtF:
case kCastFS:		case kCastFS:
case kCastFU:		case kCastFU:
case kCastSF:		case kCastSF:
case kCastUF:		case kCastUF:
case kCastS:		case kCastS:
case kCastU:		case kCastU:
case kCastIdx:		case kCastIdx:
case kTruncI:		case kTruncI:
case kBitCast:		case kBitCast:
assert(x != -1u && y == -1u && v);		assert(x != -1u && y == -1u && v && !op);
children.e0 = x;		children.e0 = x;
children.e1 = y;		children.e1 = y;
break;		break;
		case kUnary:
		children.e0 = x;
		aartbik: Assert on op at least.
		aartbik: Can we assert on y in the new model?
		jim22k: We cannot. During the original call, y==-1. When `absentRegion` is empty, we call `mapSet` and y remains equal to -1. But if `absentRegion` is not empty, we call `takeDisj` and both x and y are set. For this "binary" case handling the absent region, when buildExp is called, we only utilize v0 and completely ignore v1, even though it exists from a lattice perspective. Would it be helpful to add a comment here about why we can't assert anything about `y`?
		aartbik: Yeah, please document, so that in the future I do not try to put the assert back ;-)
		children.e1 = y;
		operation = op;
		jim22k: This is a weaker assert, but it is required because splitting up the binary results in some pieces which are fundamentally unary, so they don't have a `y`. In this iteration, I am still labeling all the split up chunks of binary as kBinary.
		aartbik: This feels hacky. I found these asserts very useful to ensure we are truly doing the right thing. If we introduce all sorts of intermediate stages, as in binary but marked unary etc, such important invariants are lost. Can we clean up the "top level" logic that transforms binary into unary (this is e.g. also done around binary minus into unary minus)?
		break;
		case kBinary:
		assert(x != -1u && !v);
		aartbik: && op
		children.e0 = x;
		children.e1 = y;
		operation = op;
		break;
		aartbik: I think kUnary/BinaryRegion should be new cases that set op, all others assert !op.
default:		default:
assert(x != -1u && y != -1u && !v);		assert(x != -1u && y != -1u && !v && !op);
children.e0 = x;		children.e0 = x;
children.e1 = y;		children.e1 = y;
break;		break;
}		}
}		}
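The constructor feedback above (initialize `op` in the initializer list, assert on it for the region kinds, assert `!op` everywhere else, and document why `y` is unchecked) could be sketched as below. This is an illustrative reduction, not the actual `TensorExp`: the `Operation` struct is a stand-in, `TensorExpSketch` is a hypothetical name, and only a few kinds are included.

```cpp
#include <cassert>

// Stand-in for mlir::Operation so the sketch compiles without MLIR.
struct Operation {};

// A few representative kinds; the real enum is much larger.
enum Kind { kTensor, kNegF, kUnary, kBinary, kMulF };

struct TensorExpSketch {
  TensorExpSketch(Kind k, unsigned x, unsigned y, Operation *o)
      : kind(k), e0(x), e1(y), op(o) {
    switch (kind) {
    case kTensor:
    case kNegF:
      assert(x != -1u && y == -1u && !o);
      break;
    case kUnary:
    case kBinary:
      // No assert on y: per the review thread, pieces split off from these
      // ops may or may not have a second child, so only x and op are checked.
      assert(x != -1u && o);
      break;
    default:
      // Ordinary binary kinds have two children and never carry an op.
      assert(x != -1u && y != -1u && !o);
      break;
    }
  }

  Kind kind;
  unsigned e0, e1;
  Operation *op; // Only set for kUnary and kBinary.
};
```

Keeping `!o` in every other case preserves the invariant checks the reviewer found useful while still allowing the new kinds to carry an operation.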

LatPoint::LatPoint(unsigned n, unsigned e, unsigned b)		LatPoint::LatPoint(unsigned n, unsigned e, unsigned b)
: bits(n, false), simple(), exp(e) {		: bits(n, false), simple(), exp(e) {
bits.set(b);		bits.set(b);
}		}

LatPoint::LatPoint(const BitVector &b, unsigned e)		LatPoint::LatPoint(const BitVector &b, unsigned e)
: bits(b), simple(), exp(e) {}		: bits(b), simple(), exp(e) {}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Lattice methods.		// Lattice methods.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

unsigned Merger::addExp(Kind k, unsigned e0, unsigned e1, Value v) {		unsigned Merger::addExp(Kind k, unsigned e0, unsigned e1, Value v,
		Operation *op) {
unsigned e = tensorExps.size();		unsigned e = tensorExps.size();
tensorExps.push_back(TensorExp(k, e0, e1, v));		tensorExps.push_back(TensorExp(k, e0, e1, v, op));
return e;		return e;
}		}

unsigned Merger::addLat(unsigned t, unsigned i, unsigned e) {		unsigned Merger::addLat(unsigned t, unsigned i, unsigned e) {
assert(t < numTensors && i < numLoops);		assert(t < numTensors && i < numLoops);
unsigned p = latPoints.size();		unsigned p = latPoints.size();
latPoints.push_back(LatPoint(numLoops * numTensors, e, numTensors * i + t));		latPoints.push_back(LatPoint(numLoops * numTensors, e, numTensors * i + t));
return p;		return p;
}		}

unsigned Merger::addSet() {		unsigned Merger::addSet() {
unsigned s = latSets.size();		unsigned s = latSets.size();
latSets.emplace_back(SmallVector<unsigned, 16>());		latSets.emplace_back(SmallVector<unsigned, 16>());
return s;		return s;
}		}

unsigned Merger::conjLatPoint(Kind kind, unsigned p0, unsigned p1) {		unsigned Merger::conjLatPoint(Kind kind, unsigned p0, unsigned p1,
		Operation *op) {
unsigned p = latPoints.size();		unsigned p = latPoints.size();
BitVector nb = BitVector(latPoints[p0].bits);		BitVector nb = BitVector(latPoints[p0].bits);
nb |= latPoints[p1].bits;		nb |= latPoints[p1].bits;
unsigned e = addExp(kind, latPoints[p0].exp, latPoints[p1].exp);		unsigned e = addExp(kind, latPoints[p0].exp, latPoints[p1].exp, Value(), op);
latPoints.push_back(LatPoint(nb, e));		latPoints.push_back(LatPoint(nb, e));
return p;		return p;
}		}

unsigned Merger::takeConj(Kind kind, unsigned s0, unsigned s1) {		unsigned Merger::takeConj(Kind kind, unsigned s0, unsigned s1, Operation *op) {
unsigned s = addSet();		unsigned s = addSet();
for (unsigned p0 : latSets[s0])		for (unsigned p0 : latSets[s0])
for (unsigned p1 : latSets[s1])		for (unsigned p1 : latSets[s1])
latSets[s].push_back(conjLatPoint(kind, p0, p1));		latSets[s].push_back(conjLatPoint(kind, p0, p1, op));
return s;		return s;
}		}

unsigned Merger::takeDisj(Kind kind, unsigned s0, unsigned s1) {		unsigned Merger::takeDisj(Kind kind, unsigned s0, unsigned s1,
unsigned s = takeConj(kind, s0, s1);		Operation *op = nullptr) {
		unsigned s = takeConj(kind, s0, s1, op);
// Followed by all in s0.		// Followed by all in s0.
for (unsigned p : latSets[s0])		for (unsigned p : latSets[s0])
latSets[s].push_back(p);		latSets[s].push_back(p);
// Map binary 0-y to unary -y.		// Map binary 0-y to unary -y.
		aartbik: Does this line apply to L144? If so, please place it after that line.
if (kind == kSubF)		if (kind == kSubF)
s1 = mapSet(kNegF, s1);		s1 = mapSet(kNegF, s1, Value(), op);
else if (kind == kSubI)		else if (kind == kSubI)
s1 = mapSet(kNegI, s1);		s1 = mapSet(kNegI, s1, Value(), op);
// Followed by all in s1.		// Followed by all in s1.
for (unsigned p : latSets[s1])		for (unsigned p : latSets[s1])
latSets[s].push_back(p);		latSets[s].push_back(p);
		aartbik: For this first version, I actually prefer that we keep takeDisj intact (other than passing op) and add a new takeCombi method that implements your new logic, just so we don't need to touch other ops in this revision. We can perhaps merge these methods later into one, but now too much is changing at once.
return s;		return s;
}		}
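The separate `takeCombi` method suggested in the comment above might be shaped like this. It is a simplified sketch under assumptions: lattice points are reduced to plain bitmasks, expressions and the per-region `Operation *` arguments are omitted, and `LatSet`, `takeConjSketch` and `takeCombiSketch` are hypothetical names.

```cpp
#include <vector>

// Each lattice point is reduced to a bitmask of the loop indices it needs.
using LatSet = std::vector<unsigned>;

// Conjunctive merge: one point per pair in the cartesian product.
LatSet takeConjSketch(const LatSet &s0, const LatSet &s1) {
  LatSet s;
  for (unsigned p0 : s0)
    for (unsigned p1 : s1)
      s.push_back(p0 | p1);
  return s;
}

// Like a disjunctive merge (L0 /\ L1, L0, L1), except that the left-only and
// right-only parts are appended only when the corresponding flag is set.
LatSet takeCombiSketch(const LatSet &s0, const LatSet &s1, bool includeLeft,
                       bool includeRight) {
  LatSet s = takeConjSketch(s0, s1);
  if (includeLeft)
    s.insert(s.end(), s0.begin(), s0.end());
  if (includeRight)
    s.insert(s.end(), s1.begin(), s1.end());
  return s;
}
```

With both flags set this degenerates to the usual disjunction, which is why the two methods could plausibly be merged later, as the comment notes.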

unsigned Merger::mapSet(Kind kind, unsigned s0, Value v) {		unsigned Merger::takeDisj(Kind kind, unsigned s0, unsigned s1, bool includeLeft,
assert(kAbsF <= kind && kind <= kBitCast);		bool includeRight, Operation *opboth,
		Operation *opleft, Operation *opright) {
		unsigned s = takeConj(kind, s0, s1, opboth);
		// Left Region
		aartbik: Period at end, here and below.
		if (includeLeft) {
		if (opleft)
		s0 = mapSet(kind, s0, Value(), opleft);
		for (unsigned p : latSets[s0])
		latSets[s].push_back(p);
		}
		// Right Region
		if (includeRight) {
		if (opright)
		s1 = mapSet(kind, s1, Value(), opright);
		for (unsigned p : latSets[s1])
		latSets[s].push_back(p);
		}
		return s;
		}
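
The shape of the new `takeDisj` overload can be illustrated with a minimal, self-contained sketch. This is a toy model rather than the Merger code: `LatPoint`, `LatSet`, `takeConjToy`, and `takeDisjToy` are hypothetical names, a lattice point is reduced to a bitmask of participating tensors, and the per-region `Operation *` arguments are left out.

```cpp
#include <cstdint>
#include <vector>

// Toy model (assumption): a lattice point is only a bitmask of the tensors
// that participate at that point; a lattice set is an ordered list of points.
using LatPoint = std::uint32_t;
using LatSet = std::vector<LatPoint>;

// Conjunction: pair every point of s0 with every point of s1.
LatSet takeConjToy(const LatSet &s0, const LatSet &s1) {
  LatSet s;
  for (LatPoint p0 : s0)
    for (LatPoint p1 : s1)
      s.push_back(p0 | p1);
  return s;
}

// Disjunction with optional halves: the overlap points always come first,
// followed by the left-only points and the right-only points when requested.
LatSet takeDisjToy(const LatSet &s0, const LatSet &s1, bool includeLeft,
                   bool includeRight) {
  LatSet s = takeConjToy(s0, s1);
  if (includeLeft)
    s.insert(s.end(), s0.begin(), s0.end());
  if (includeRight)
    s.insert(s.end(), s1.begin(), s1.end());
  return s;
}
```

With `s0 = {1}` and `s1 = {2}`, enabling both halves yields the points `3, 1, 2`, while disabling the left half yields only `3, 2`. An empty `left` region without `left_identity` would correspond to the second case.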

		unsigned Merger::mapSet(Kind kind, unsigned s0, Value v, Operation *op) {
		assert(kind == kBinary || (kAbsF <= kind && kind <= kUnary));
unsigned s = addSet();		unsigned s = addSet();
for (unsigned p : latSets[s0]) {		for (unsigned p : latSets[s0]) {
unsigned e = addExp(kind, latPoints[p].exp, v);		unsigned e = addExp(kind, latPoints[p].exp, v, op);
latPoints.push_back(LatPoint(latPoints[p].bits, e));		latPoints.push_back(LatPoint(latPoints[p].bits, e));
latSets[s].push_back(latPoints.size() - 1);		latSets[s].push_back(latPoints.size() - 1);
}		}
return s;		return s;
}		}

unsigned Merger::optimizeSet(unsigned s0) {		unsigned Merger::optimizeSet(unsigned s0) {
unsigned s = addSet();		unsigned s = addSet();
[153 lines hidden]	static const char *kindToOpSymbol(Kind kind) {
case kCastSF:		case kCastSF:
case kCastUF:		case kCastUF:
case kCastS:		case kCastS:
case kCastU:		case kCastU:
case kCastIdx:		case kCastIdx:
case kTruncI:		case kTruncI:
case kBitCast:		case kBitCast:
return "cast";		return "cast";
		case kUnary:
		aartbik: Please follow the same order as in the enum decl.
		return "unary";
case kMulF:		case kMulF:
return "*";		return "*";
case kMulI:		case kMulI:
return "*";		return "*";
case kDivF:		case kDivF:
return "/";		return "/";
case kDivS:		case kDivS:
return "/";		return "/";
[14 lines hidden]	static const char *kindToOpSymbol(Kind kind) {
case kXorI:		case kXorI:
return "^";		return "^";
case kShrS:		case kShrS:
return "a>>";		return "a>>";
case kShrU:		case kShrU:
return ">>";		return ">>";
case kShlI:		case kShlI:
return "<<";		return "<<";
		case kBinary:
		return "binary";
}		}
llvm_unreachable("unexpected kind for symbol");		llvm_unreachable("unexpected kind for symbol");
}		}

void Merger::dumpExp(unsigned e) const {		void Merger::dumpExp(unsigned e) const {
switch (tensorExps[e].kind) {		switch (tensorExps[e].kind) {
case kTensor:		case kTensor:
if (tensorExps[e].tensor == syntheticTensor)		if (tensorExps[e].tensor == syntheticTensor)
[80 lines hidden]
}		}

#endif // NDEBUG		#endif // NDEBUG

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Builder methods.		// Builder methods.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

unsigned Merger::buildLattices(unsigned e, unsigned i) {		unsigned Merger::buildLattices(unsigned e, unsigned i, unsigned z) {
Kind kind = tensorExps[e].kind;		Kind kind = tensorExps[e].kind;
switch (kind) {		switch (kind) {
case kTensor:		case kTensor:
case kInvariant:		case kInvariant:
case kIndex: {		case kIndex: {
// Either the index is really used in the tensor expression, or it is		// Either the index is really used in the tensor expression, or it is
// set to the undefined index in that dimension. An invariant expression,		// set to the undefined index in that dimension. An invariant expression,
// a proper index value, and a truly dynamic sparse output tensor are set		// a proper index value, and a truly dynamic sparse output tensor are set
[26 lines hidden]	unsigned Merger::buildLattices(unsigned e, unsigned i, unsigned z) {
case kTruncI:		case kTruncI:
case kBitCast:		case kBitCast:
// A zero preserving operation (viz. f(0) = 0, [Bik96,Ch5]) maps the		// A zero preserving operation (viz. f(0) = 0, [Bik96,Ch5]) maps the
// lattice set of the operand through the operator into a new set.		// lattice set of the operand through the operator into a new set.
//		//
// -y|!y | y |		// -y|!y | y |
// --+---+---+		// --+---+---+
// | 0 |-y |		// | 0 |-y |
return mapSet(kind, buildLattices(tensorExps[e].children.e0, i),		return mapSet(kind, buildLattices(tensorExps[e].children.e0, i, z),
tensorExps[e].val);		tensorExps[e].val);
		case kUnary:
		// A custom unary operation
		//
		// op y| !y | y |
		// ----+----------+------------+
		// | absent() | present(y) |
		{
		UnaryOp unop = dyn_cast<UnaryOp>(tensorExps[e].operation);
		assert(unop);
		aartbik: Just use cast (not dyn_cast) and no assert, since this should always work.
		Region &presentRegion = unop.presentRegion();
		Region &absentRegion = unop.absentRegion();
		aartbik: splinter?
		unsigned child0 = buildLattices(tensorExps[e].children.e0, i, z);

		aartbik: Note that this could be a simple cast, not a dyn_cast. Preferably, though, it looks like we could simply keep a single kUnaryRegion case; see below.
		if (z != 1) {
		aartbik: I patched in your revision, but got compilation errors here. I think you need to add #include "mlir/Dialect/SparseTensor/IR/SparseTensor.h"
		jim22k: That should already be there. See line 11.
		// When z == 1, this will be resolved correctly.
		return mapSet(kind, child0, Value(), unop);
		}

		Operation *presentYield = nullptr;
		if (!presentRegion.empty()) {
		aartbik: Same here and below: if the cast should work, just cast.
		Block &presentBlock = presentRegion.front();
		presentYield = presentBlock.getTerminator();
		}
		if (absentRegion.empty()) {
		// Simple mapping over existing values
		aartbik: Period at end.
		return mapSet(kind, child0, Value(), presentYield);
		} else {
		// Use a disjunction with `unop` on the left and the absent value as an
		// invariant on the right
		jim22k: I need help here. I'm not sure what I need to include such that `absentVal` is added to the output for every missing value in the input. I assume this is something similar to kTensor or kInvariant, as doing `v+1` inside linalg.generic will create a dense output. I need something similar to happen here.
		aartbik: This is indeed a bit trickier. For the time being, to focus on the overall logic, let's simply put this case under the same case as all the other zero-preserving unary ops, and assert that absent() is not set. Then we will enhance the logic later.
		jim22k: I figured out how to make the `absent` region work. I essentially treat it like `v + 1` (which is a binary function). Using the same logic, I perform a disjunction. The overlap will be replaced by the `present` block, which only has one input argument, so it will substitute the left argument rather than being truly binary. The rhs is a fixed value, so I create a kInvariant for it, and it covers everything that is not a conjunction.
		Block &absentBlock = absentRegion.front();
		YieldOp absentYield = dyn_cast<YieldOp>(absentBlock.getTerminator());
		Value absentVal = absentYield.result();
		unsigned rhs = addExp(kInvariant, absentVal);
		return takeDisj(kind, child0, buildLattices(rhs, i, z), presentYield);
		}
		}
case kMulF:		case kMulF:
case kMulI:		case kMulI:
case kAndI:		case kAndI:
// A multiplicative operation only needs to be performed		// A multiplicative operation only needs to be performed
// for the conjunction of sparse iteration spaces.		// for the conjunction of sparse iteration spaces.
//		//
// x*y|!y | y |		// x*y|!y | y |
// ---+---+---+		// ---+---+---+
// !x | 0 | 0 |		// !x | 0 | 0 |
// x | 0 |x*y|		// x | 0 |x*y|
return takeConj(kind, // take binary conjunction		return takeConj(kind, // take binary conjunction
buildLattices(tensorExps[e].children.e0, i),		buildLattices(tensorExps[e].children.e0, i, z),
buildLattices(tensorExps[e].children.e1, i));		buildLattices(tensorExps[e].children.e1, i, z));
case kDivF:		case kDivF:
case kDivS:		case kDivS:
case kDivU:		case kDivU:
// A division is tricky, since 0/0, 0/c, c/0 all have		// A division is tricky, since 0/0, 0/c, c/0 all have
// specific outcomes for floating-point and integers.		// specific outcomes for floating-point and integers.
// Thus, we need to traverse the full iteration space.		// Thus, we need to traverse the full iteration space.
//		//
// x/y|!y | y |		// x/y|!y | y |
// ---+---+---+		// ---+---+---+
// !x |0/0|0/y| FP: 0/0=NaN,c/0=Inf,0/c=0 with c true nonzero		// !x |0/0|0/y| FP: 0/0=NaN,c/0=Inf,0/c=0 with c true nonzero
// x |x/0|x/y| INT: x/0=exception for any x		// x |x/0|x/y| INT: x/0=exception for any x
//		//
// TODO: for now we "fixed" this by only accepting x/c cases		// TODO: for now we "fixed" this by only accepting x/c cases
// during expression building, so that the conjunction		// during expression building, so that the conjunction
// rules applies (viz. x/c = x*(1/c) as far as lattice		// rules applies (viz. x/c = x*(1/c) as far as lattice
// construction is concerned).		// construction is concerned).
assert(!maybeZero(tensorExps[e].children.e1));		assert(!maybeZero(tensorExps[e].children.e1));
return takeConj(kind, // take binary conjunction		return takeConj(kind, // take binary conjunction
buildLattices(tensorExps[e].children.e0, i),		buildLattices(tensorExps[e].children.e0, i, z),
buildLattices(tensorExps[e].children.e1, i));		buildLattices(tensorExps[e].children.e1, i, z));
case kAddF:		case kAddF:
case kAddI:		case kAddI:
case kSubF:		case kSubF:
case kSubI:		case kSubI:
case kOrI:		case kOrI:
case kXorI:		case kXorI:
// An additive operation needs to be performed		// An additive operation needs to be performed
// for the disjunction of sparse iteration spaces.		// for the disjunction of sparse iteration spaces.
//		//
// x+y|!y | y | x-y|!y | y |		// x+y|!y | y | x-y|!y | y |
// ---+---+---+ ---+---+---+		// ---+---+---+ ---+---+---+
// !x | 0 | y | !x | 0 |-y |		// !x | 0 | y | !x | 0 |-y |
// x | x |x+y| x | x |x-y|		// x | x |x+y| x | x |x-y|
return takeDisj(kind, // take binary disjunction		return takeDisj(kind, // take binary disjunction
buildLattices(tensorExps[e].children.e0, i),		buildLattices(tensorExps[e].children.e0, i, z),
buildLattices(tensorExps[e].children.e1, i));		buildLattices(tensorExps[e].children.e1, i, z));
case kShrS:		case kShrS:
case kShrU:		case kShrU:
case kShlI:		case kShlI:
// A shift operation by an invariant amount (viz. tensor expressions		// A shift operation by an invariant amount (viz. tensor expressions
// can only occur at the left-hand-side of the operator) can be handled		// can only occur at the left-hand-side of the operator) can be handled
// with the conjuction rule.		// with the conjuction rule.
assert(isInvariant(tensorExps[e].children.e1));		assert(isInvariant(tensorExps[e].children.e1));
return takeConj(kind, // take binary conjunction		return takeConj(kind, // take binary conjunction
buildLattices(tensorExps[e].children.e0, i),		buildLattices(tensorExps[e].children.e0, i, z),
buildLattices(tensorExps[e].children.e1, i));		buildLattices(tensorExps[e].children.e1, i, z));
		case kBinary:
		// A custom binary operation
		//
		jim22k: I'm stuck here. The `kBinary` below creates one kBinaryRegion and two kUnaryRegions. These are not called/handled for the vector tests. But for the matrix test, the kBinaryRegion is re-evaluated for some reason. The problem is that the original `sparse_tensor.binary` operation has already been split apart. I essentially need a no-op here (i.e. it's already working fine; don't re-evaluate). I tried calling `takeConj` again, but that somehow eliminates the disjoint pieces that were added in the kBinary section.
		aartbik: If you rewrite it as suggested below, I think the no-op comes naturally.
		// x op y| !y | y |
		// ------+---------+--------------+
		// !x | empty | right(y) |
		// x | left(x) | overlap(x,y) |
		{
		BinaryOp binop = dyn_cast<BinaryOp>(tensorExps[e].operation);
		assert(binop);
		Region &overlapRegion = binop.overlapRegion();
		Region &leftRegion = binop.leftRegion();
		Region &rightRegion = binop.rightRegion();
		unsigned child0 = buildLattices(tensorExps[e].children.e0, i, z);
		unsigned child1 = buildLattices(tensorExps[e].children.e1, i, z);

		if (z != 1) {
		jim22k: I don't know if this is the correct approach, but it makes the matrix tests all pass. We only split up the binary operation once and we never need to return a no-op. At least, that will hold as long as z==1 for the last time we try to buildLattices() on the binary operation.
		// When z == 1, this will be resolved correctly.
		return takeConj(kind, buildLattices(tensorExps[e].children.e0, i, z),
		buildLattices(tensorExps[e].children.e1, i, z), binop);
		aartbik: Same, this should be a cast here, but it looks like we could simply keep a single kBinaryRegion case by looking at the result of this cast (which then needs to be dynamic again ;-) and decide what to do. I think I prefer that a bit more than artificially introducing the kBinary/kUnary as "handled" cases.
		jim22k: That seems reasonable. I will refactor and check the result of dyn_cast. If it fails, it means the BinaryOp has already been handled. `buildLattices` has an unsigned return type, so what should I return in that case? This is where I want a no-op, meaning "don't add any new lattice points".
		aartbik: If at all possible, I would like the "nop" to be detected at a higher level, i.e. before building the new lattices. Otherwise we would have to somehow return the existing set id (returning a nop value is a bit too intrusive for my taste). But after you have done the restructuring suggested here, I will have another look.
		}

		// Overlap Region
		Operation *overlapYield = nullptr;
		if (!overlapRegion.empty()) {
		Block &overlapBlock = overlapRegion.front();
		overlapYield = overlapBlock.getTerminator();
		aartbik: It feels like this whole block of code, L612 to L639, is really takeDisj with some smart selection of the branches. Perhaps you can split out the analysis of the MLIR IR (getting the three branches), and then write a new takeDisj(... , opboth, opleft, opright) and put that takeDisj close to the other one, just so that the actual lattice logic is not so deeply buried inside this huge block.
		jim22k: Good idea to move this up near the other takeDisj().
		}
		// Left Region
		aartbik: Period at end.
		Operation *leftYield = nullptr;
		if (!leftRegion.empty()) {
		Block &leftBlock = leftRegion.front();
		leftYield = leftBlock.getTerminator();
		}
		// Right Region
		Operation *rightYield = nullptr;
		if (!rightRegion.empty()) {
		Block &rightBlock = rightRegion.front();
		rightYield = rightBlock.getTerminator();
		}
		return takeDisj(kind, child0, child1,
		binop.left_identity() || !leftRegion.empty(),
		binop.right_identity() || !rightRegion.empty(),
		overlapYield, leftYield, rightYield);
		}
		jim22k: I created the new takeDisj, which is very nice. It does mean that all of the split-up pieces of kBinary remain labeled as kBinary, even for `left` and `right`, which only have a single input argument.
}		}
llvm_unreachable("unexpected expression kind");		llvm_unreachable("unexpected expression kind");
}		}
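
The `absent` handling discussed in the kUnary case above (apply `present` to stored entries, use a fixed value everywhere else, which densifies the output much like `v + 1` does) can be modeled outside MLIR. The following is a hedged sketch over a plain 1-D container, not the lattice construction itself; `unaryWithAbsentToy` and its parameters are hypothetical names.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <vector>

// Toy model (assumption): a sparse vector is a map from index to stored
// value. Every position starts out with the absent value (the role played
// by the invariant on the right of the disjunction), and each stored entry
// is then overwritten with present(value).
std::vector<double>
unaryWithAbsentToy(const std::map<std::size_t, double> &stored,
                   std::size_t size,
                   const std::function<double(double)> &present,
                   double absentVal) {
  std::vector<double> out(size, absentVal);
  for (const auto &entry : stored)
    if (entry.first < size)
      out[entry.first] = present(entry.second);
  return out;
}
```

For stored entries `{1: 2.0, 3: 4.0}`, a size of 4, a `present` that multiplies by 10, and an absent value of -1, the result is `-1, 20, -1, 40`. The output is dense because every position receives a value.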

Optional<unsigned> Merger::buildTensorExpFromLinalg(linalg::GenericOp op) {		Optional<unsigned> Merger::buildTensorExpFromLinalg(linalg::GenericOp op) {
Operation *yield = op.region().front().getTerminator();		Operation *yield = op.region().front().getTerminator();
return buildTensorExp(op, yield->getOperand(0));		return buildTensorExp(op, yield->getOperand(0));
}		}
[78 lines hidden]	if (x.hasValue()) {
if (isa<arith::ExtUIOp>(def))		if (isa<arith::ExtUIOp>(def))
return addExp(kCastU, e, v);		return addExp(kCastU, e, v);
if (isa<arith::IndexCastOp>(def))		if (isa<arith::IndexCastOp>(def))
return addExp(kCastIdx, e, v);		return addExp(kCastIdx, e, v);
if (isa<arith::TruncIOp>(def))		if (isa<arith::TruncIOp>(def))
return addExp(kTruncI, e, v);		return addExp(kTruncI, e, v);
if (isa<arith::BitcastOp>(def))		if (isa<arith::BitcastOp>(def))
return addExp(kBitCast, e, v);		return addExp(kBitCast, e, v);
		if (isa<sparse_tensor::UnaryOp>(def)) {
		aartbik: At first reading, it was surprising to just see "UnaryOp" here, since that looks like just another arith:: version. Prefixing it with namespace sparse_tensor:: is of course against the style guide, but would look clearer. Perhaps we should rename the Unary/BinaryOp of the sparse tensor dialect to something a bit more specific (can be done later).
		aartbik: Braces not needed.
		return addExp(kUnary, e, Value(), def);
		}
}		}
}		}
// Construct binary operations if subexpressions can be built.		// Construct binary operations if subexpressions can be built.
// See buildLattices() for an explanation of rejecting certain		// See buildLattices() for an explanation of rejecting certain
// division and shift operations		// division and shift operations
if (def->getNumOperands() == 2) {		if (def->getNumOperands() == 2) {
auto x = buildTensorExp(op, def->getOperand(0));		auto x = buildTensorExp(op, def->getOperand(0));
auto y = buildTensorExp(op, def->getOperand(1));		auto y = buildTensorExp(op, def->getOperand(1));
[25 lines hidden]	if (x.hasValue() && y.hasValue()) {
if (isa<arith::XOrIOp>(def))		if (isa<arith::XOrIOp>(def))
return addExp(kXorI, e0, e1);		return addExp(kXorI, e0, e1);
if (isa<arith::ShRSIOp>(def) && isInvariant(e1))		if (isa<arith::ShRSIOp>(def) && isInvariant(e1))
return addExp(kShrS, e0, e1);		return addExp(kShrS, e0, e1);
if (isa<arith::ShRUIOp>(def) && isInvariant(e1))		if (isa<arith::ShRUIOp>(def) && isInvariant(e1))
return addExp(kShrU, e0, e1);		return addExp(kShrU, e0, e1);
if (isa<arith::ShLIOp>(def) && isInvariant(e1))		if (isa<arith::ShLIOp>(def) && isInvariant(e1))
return addExp(kShlI, e0, e1);		return addExp(kShlI, e0, e1);
		if (isa<sparse_tensor::BinaryOp>(def)) {
		aartbik: Braces not needed.
		return addExp(kBinary, e0, e1, Value(), def);
		}
}		}
}		}
// Cannot build.		// Cannot build.
return None;		return None;
}		}

Value Merger::buildExp(PatternRewriter &rewriter, Location loc, unsigned e,		Value Merger::buildExp(PatternRewriter &rewriter, Location loc, unsigned e,
		aartbik: Mark these new helpers as "static" since they are private to the file.
Value v0, Value v1) {		Value v0, Value v1) {
switch (tensorExps[e].kind) {		switch (tensorExps[e].kind) {
case kTensor:		case kTensor:
case kInvariant:		case kInvariant:
case kIndex:		case kIndex:
llvm_unreachable("unexpected non-op");		llvm_unreachable("unexpected non-op");
// Unary ops.		// Unary ops.
		aartbik: Use cast; we assume that the verifier has filtered out bad cases.
case kAbsF:		case kAbsF:
return rewriter.create<math::AbsOp>(loc, v0);		return rewriter.create<math::AbsOp>(loc, v0);
case kCeilF:		case kCeilF:
return rewriter.create<math::CeilOp>(loc, v0);		return rewriter.create<math::CeilOp>(loc, v0);
case kFloorF:		case kFloorF:
return rewriter.create<math::FloorOp>(loc, v0);		return rewriter.create<math::FloorOp>(loc, v0);
case kNegF:		case kNegF:
return rewriter.create<arith::NegFOp>(loc, v0);		return rewriter.create<arith::NegFOp>(loc, v0);
[22 lines hidden]	Value Merger::buildExp(PatternRewriter &rewriter, Location loc, unsigned e,
case kCastIdx:		case kCastIdx:
return rewriter.create<arith::IndexCastOp>(loc, inferType(e, v0), v0);		return rewriter.create<arith::IndexCastOp>(loc, inferType(e, v0), v0);
case kTruncI:		case kTruncI:
return rewriter.create<arith::TruncIOp>(loc, inferType(e, v0), v0);		return rewriter.create<arith::TruncIOp>(loc, inferType(e, v0), v0);
case kBitCast:		case kBitCast:
return rewriter.create<arith::BitcastOp>(loc, inferType(e, v0), v0);		return rewriter.create<arith::BitcastOp>(loc, inferType(e, v0), v0);
// Binary ops.		// Binary ops.
case kMulF:		case kMulF:
return rewriter.create<arith::MulFOp>(loc, v0, v1);		return rewriter.create<arith::MulFOp>(loc, v0, v1);
case kMulI:		case kMulI:
		aartbik: What does it mean if we hit this part? An empty block? Or an error?
		jim22k: This means an empty block, so I want to indicate no value.
return rewriter.create<arith::MulIOp>(loc, v0, v1);		return rewriter.create<arith::MulIOp>(loc, v0, v1);
case kDivF:		case kDivF:
return rewriter.create<arith::DivFOp>(loc, v0, v1);		return rewriter.create<arith::DivFOp>(loc, v0, v1);
case kDivS:		case kDivS:
return rewriter.create<arith::DivSIOp>(loc, v0, v1);		return rewriter.create<arith::DivSIOp>(loc, v0, v1);
case kDivU:		case kDivU:
return rewriter.create<arith::DivUIOp>(loc, v0, v1);		return rewriter.create<arith::DivUIOp>(loc, v0, v1);
case kAddF:		case kAddF:
[11 lines hidden]	Value Merger::buildExp(PatternRewriter &rewriter, Location loc, unsigned e,
case kXorI:		case kXorI:
return rewriter.create<arith::XOrIOp>(loc, v0, v1);		return rewriter.create<arith::XOrIOp>(loc, v0, v1);
case kShrS:		case kShrS:
return rewriter.create<arith::ShRSIOp>(loc, v0, v1);		return rewriter.create<arith::ShRSIOp>(loc, v0, v1);
case kShrU:		case kShrU:
return rewriter.create<arith::ShRUIOp>(loc, v0, v1);		return rewriter.create<arith::ShRUIOp>(loc, v0, v1);
case kShlI:		case kShlI:
return rewriter.create<arith::ShLIOp>(loc, v0, v1);		return rewriter.create<arith::ShLIOp>(loc, v0, v1);
		// Set-like ops with custom logic.
		case kUnary:
		case kBinary: {
		jim22k: These can be combined in the switch because their logic is identical. The binary operation splits up into pieces which may have 1 or 2 input arguments, so they look just like the unary operation being split up.
		Operation *op = tensorExps[e].operation;
		aartbik: The unary and binary codegen blocks are a bit too large for this context (most others are one-liners), so please move them into their own method.
		if (!op)
		return Value();
		// Make a clone of the block
		aartbik: Period at end.
		Region tmpRegion;
		BlockAndValueMapping mapper;
		op->getBlock()->getParent()->cloneInto(&tmpRegion, tmpRegion.begin(),
		mapper);
		Block &clonedBlock = tmpRegion.front();
		YieldOp clonedYield = dyn_cast<YieldOp>(clonedBlock.getTerminator());
		// Merge cloned block and return yield value
		aartbik: Period at end.
		Operation *placeholder = rewriter.create<arith::ConstantIndexOp>(loc, 0);
		if (clonedBlock.getNumArguments() == 2) {
		if (!v0 or !v1)
		aartbik: Here and below, document these shortcut returns.
		return Value();
		rewriter.mergeBlockBefore(&tmpRegion.front(), placeholder, {v0, v1});
		} else {
		if (!v0)
		return Value();
		rewriter.mergeBlockBefore(&tmpRegion.front(), placeholder, {v0});
		}
		Value val = clonedYield.result();
		rewriter.eraseOp(clonedYield);
		rewriter.eraseOp(placeholder);
		return val;
		}
}		}
llvm_unreachable("unexpected expression kind in build");		llvm_unreachable("unexpected expression kind in build");
}		}

} // namespace sparse_tensor		} // namespace sparse_tensor
} // namespace mlir		} // namespace mlir

mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_matrix_ops.mlir

[47 lines hidden]	%0 = linalg.generic #trait_scale
outs(%xm: tensor<?x?xf64, #DCSR>) {		outs(%xm: tensor<?x?xf64, #DCSR>) {
^bb(%a: f64, %x: f64):		^bb(%a: f64, %x: f64):
%1 = arith.mulf %a, %s : f64		%1 = arith.mulf %a, %s : f64
linalg.yield %1 : f64		linalg.yield %1 : f64
} -> tensor<?x?xf64, #DCSR>		} -> tensor<?x?xf64, #DCSR>
return %0 : tensor<?x?xf64, #DCSR>		return %0 : tensor<?x?xf64, #DCSR>
}		}

		// Clips values to the range [3, 7]
		aartbik: Period at end.
		func @matrix_clip(%argx: tensor<?x?xf64, #DCSR>) -> tensor<?x?xf64, #DCSR> {
		aartbik: Rather than adding to and modifying an existing test, I would much rather that you add two new integration tests: unary and binary. Yes, it takes some boilerplate to set up inputs, but that way it is clearer where the semi-ring stuff is tested.
		%c0 = arith.constant 0 : index
		%c1 = arith.constant 1 : index
		%cfmin = arith.constant 3.0 : f64
		%cfmax = arith.constant 7.0 : f64
		%d0 = tensor.dim %argx, %c0 : tensor<?x?xf64, #DCSR>
		%d1 = tensor.dim %argx, %c1 : tensor<?x?xf64, #DCSR>
		%xv = sparse_tensor.init [%d0, %d1] : tensor<?x?xf64, #DCSR>
		%0 = linalg.generic #trait_scale
		ins(%argx: tensor<?x?xf64, #DCSR>)
		outs(%xv: tensor<?x?xf64, #DCSR>) {
		^bb(%a: f64, %x: f64):
		%1 = sparse_tensor.unary %a: f64 to f64
		present={
		^bb0(%x0: f64):
		%mincmp = arith.cmpf "ogt", %x0, %cfmin : f64
		%x1 = arith.select %mincmp, %x0, %cfmin : f64
		%maxcmp = arith.cmpf "olt", %x1, %cfmax : f64
		%x2 = arith.select %maxcmp, %x1, %cfmax : f64
		sparse_tensor.yield %x2 : f64
		}
		absent={}
		linalg.yield %1 : f64
		} -> tensor<?x?xf64, #DCSR>
		return %0 : tensor<?x?xf64, #DCSR>
		}
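
As a cross-check of what `@matrix_clip` is expected to compute, here is a small reference model in plain C++. It is only a sketch under stated assumptions: `SparseMatrix` and `clipStoredToy` are hypothetical names, and the sparse matrix is modeled as a map holding only the stored entries.

```cpp
#include <algorithm>
#include <map>
#include <utility>

// Toy model (assumption): only stored entries appear, keyed by (row, col).
using SparseMatrix = std::map<std::pair<int, int>, double>;

// Applies the `present` computation to every stored entry. Unstored entries
// are never visited, which mirrors the empty `absent` region in the test.
SparseMatrix clipStoredToy(const SparseMatrix &m, double lo, double hi) {
  SparseMatrix out;
  for (const auto &entry : m) {
    // Same order as the test: first raise to the minimum, then cap at the
    // maximum.
    out[entry.first] = std::min(std::max(entry.second, lo), hi);
  }
  return out;
}
```

Clipping stored values 1, 5, and 9 to the range [3, 7] gives 3, 5, and 7, and positions without a stored entry stay unstored rather than becoming 3.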

// Scales a sparse matrix in place.		// Scales a sparse matrix in place.
func @matrix_scale_inplace(%argx: tensor<?x?xf64, #DCSR>		func @matrix_scale_inplace(%argx: tensor<?x?xf64, #DCSR>
{linalg.inplaceable = true}) -> tensor<?x?xf64, #DCSR> {		{linalg.inplaceable = true}) -> tensor<?x?xf64, #DCSR> {
%s = arith.constant 2.0 : f64		%s = arith.constant 2.0 : f64
%0 = linalg.generic #trait_scale_inpl		%0 = linalg.generic #trait_scale_inpl
outs(%argx: tensor<?x?xf64, #DCSR>) {		outs(%argx: tensor<?x?xf64, #DCSR>) {
^bb(%x: f64):		^bb(%x: f64):
%1 = arith.mulf %x, %s : f64		%1 = arith.mulf %x, %s : f64
[33 lines hidden]	%0 = linalg.generic #trait_op
outs(%xv: tensor<?x?xf64, #DCSR>) {		outs(%xv: tensor<?x?xf64, #DCSR>) {
^bb(%a: f64, %b: f64, %x: f64):		^bb(%a: f64, %b: f64, %x: f64):
%1 = arith.mulf %a, %b : f64		%1 = arith.mulf %a, %b : f64
linalg.yield %1 : f64		linalg.yield %1 : f64
} -> tensor<?x?xf64, #DCSR>		} -> tensor<?x?xf64, #DCSR>
return %0 : tensor<?x?xf64, #DCSR>		return %0 : tensor<?x?xf64, #DCSR>
}		}

		// Adds two sparse matrices when they intersect. Where they don't intersect,
		// negate the 2nd argument's values and don't include the 1st argument's values.
		func @matrix_intersect(%arga: tensor<?x?xf64, #DCSR>,
		jim22k: This test fails with an assertion error: `Assertion failed: (!add || latGT(p0, p1)), function optimizeSet, file Merger.cpp, line 167.`
		jim22k: This is now passing.
		%argb: tensor<?x?xf64, #DCSR>) -> tensor<?x?xf64, #DCSR> {
		%c0 = arith.constant 0 : index
		%c1 = arith.constant 1 : index
		%d0 = tensor.dim %arga, %c0 : tensor<?x?xf64, #DCSR>
		%d1 = tensor.dim %arga, %c1 : tensor<?x?xf64, #DCSR>
		%xv = sparse_tensor.init [%d0, %d1] : tensor<?x?xf64, #DCSR>
		%0 = linalg.generic #trait_op
		ins(%arga, %argb: tensor<?x?xf64, #DCSR>, tensor<?x?xf64, #DCSR>)
		outs(%xv: tensor<?x?xf64, #DCSR>) {
		^bb(%a: f64, %b: f64, %x: f64):
		%1 = sparse_tensor.binary %a, %b: f64, f64 to f64
		overlap={
		^bb0(%x0: f64, %y0: f64):
		%ret = arith.addf %x0, %y0 : f64
		sparse_tensor.yield %ret : f64
		}
		left={}
		right={
		^bb0(%x1: f64):
		%lret = arith.negf %x1 : f64
		sparse_tensor.yield %lret : f64
		}
		linalg.yield %1 : f64
		} -> tensor<?x?xf64, #DCSR>
		return %0 : tensor<?x?xf64, #DCSR>
		}
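
The intended semantics of `@matrix_intersect` (add on overlap, drop left-only entries, negate right-only entries) can be written down as a plain C++ reference model. This is a hedged sketch, not the generated code: `SparseMatrix` and `intersectToy` are hypothetical names, and the matrices are modeled as maps of stored entries.

```cpp
#include <map>
#include <utility>

// Toy model (assumption): only stored entries appear, keyed by (row, col).
using SparseMatrix = std::map<std::pair<int, int>, double>;

// overlap: a + b; left: empty (entries only in a are dropped);
// right: -b (entries only in b are negated).
SparseMatrix intersectToy(const SparseMatrix &a, const SparseMatrix &b) {
  SparseMatrix out;
  for (const auto &eb : b) {
    auto ia = a.find(eb.first);
    if (ia != a.end())
      out[eb.first] = ia->second + eb.second; // overlap region
    else
      out[eb.first] = -eb.second; // right region
  }
  // Nothing is emitted for keys found only in `a` (empty left region).
  return out;
}
```

Every output key comes from `b`, so iterating over `b` alone is enough in this model. A non-empty `left` region would additionally require a pass over the keys found only in `a`.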

// Dump a sparse matrix.		// Dump a sparse matrix.
func @dump(%arg0: tensor<?x?xf64, #DCSR>) {		func @dump(%arg0: tensor<?x?xf64, #DCSR>) {
%d0 = arith.constant 0.0 : f64		%d0 = arith.constant 0.0 : f64
%c0 = arith.constant 0 : index		%c0 = arith.constant 0 : index
%dm = sparse_tensor.convert %arg0 : tensor<?x?xf64, #DCSR> to tensor<?x?xf64>		%dm = sparse_tensor.convert %arg0 : tensor<?x?xf64, #DCSR> to tensor<?x?xf64>
%0 = bufferization.to_memref %dm : memref<?x?xf64>		%0 = bufferization.to_memref %dm : memref<?x?xf64>
%1 = vector.transfer_read %0[%c0, %c0], %d0: memref<?x?xf64>, vector<4x8xf64>		%1 = vector.transfer_read %0[%c0, %c0], %d0: memref<?x?xf64>, vector<4x8xf64>
vector.print %1 : vector<4x8xf64>		vector.print %1 : vector<4x8xf64>
[16 lines hidden]	%m2 = arith.constant sparse<
[6.0, 5.0, 4.0, 3.0, 2.0, 1.0 ]		[6.0, 5.0, 4.0, 3.0, 2.0, 1.0 ]
> : tensor<4x8xf64>		> : tensor<4x8xf64>
%sm1 = sparse_tensor.convert %m1 : tensor<4x8xf64> to tensor<?x?xf64, #DCSR>		%sm1 = sparse_tensor.convert %m1 : tensor<4x8xf64> to tensor<?x?xf64, #DCSR>
%sm2 = sparse_tensor.convert %m2 : tensor<4x8xf64> to tensor<?x?xf64, #DCSR>		%sm2 = sparse_tensor.convert %m2 : tensor<4x8xf64> to tensor<?x?xf64, #DCSR>

// Call sparse vector kernels.		// Call sparse vector kernels.
%0 = call @matrix_scale(%sm1)		%0 = call @matrix_scale(%sm1)
: (tensor<?x?xf64, #DCSR>) -> tensor<?x?xf64, #DCSR>		: (tensor<?x?xf64, #DCSR>) -> tensor<?x?xf64, #DCSR>
%1 = call @matrix_scale_inplace(%sm1)		%1 = call @matrix_clip(%sm1)
		: (tensor<?x?xf64, #DCSR>) -> tensor<?x?xf64, #DCSR>
		%2 = call @matrix_scale_inplace(%sm1)
: (tensor<?x?xf64, #DCSR>) -> tensor<?x?xf64, #DCSR>		: (tensor<?x?xf64, #DCSR>) -> tensor<?x?xf64, #DCSR>
%2 = call @matrix_add(%sm1, %sm2)		%3 = call @matrix_add(%sm1, %sm2)
		: (tensor<?x?xf64, #DCSR>, tensor<?x?xf64, #DCSR>) -> tensor<?x?xf64, #DCSR>
		%4 = call @matrix_mul(%sm1, %sm2)
: (tensor<?x?xf64, #DCSR>, tensor<?x?xf64, #DCSR>) -> tensor<?x?xf64, #DCSR>		: (tensor<?x?xf64, #DCSR>, tensor<?x?xf64, #DCSR>) -> tensor<?x?xf64, #DCSR>
%3 = call @matrix_mul(%sm1, %sm2)		%5 = call @matrix_intersect(%sm1, %sm2)
: (tensor<?x?xf64, #DCSR>, tensor<?x?xf64, #DCSR>) -> tensor<?x?xf64, #DCSR>		: (tensor<?x?xf64, #DCSR>, tensor<?x?xf64, #DCSR>) -> tensor<?x?xf64, #DCSR>

//		//
// Verify the results.		// Verify the results.
//		//
// CHECK: ( ( 2, 4, 0, 0, 0, 0, 0, 0 ), ( 0, 0, 0, 0, 0, 0, 0, 6 ), ( 0, 0, 8, 0, 10, 0, 0, 12 ), ( 14, 0, 16, 18, 0, 0, 0, 0 ) )		// CHECK: ( ( 2, 4, 0, 0, 0, 0, 0, 0 ), ( 0, 0, 0, 0, 0, 0, 0, 6 ), ( 0, 0, 8, 0, 10, 0, 0, 12 ), ( 14, 0, 16, 18, 0, 0, 0, 0 ) )
// CHECK-NEXT: ( ( 6, 0, 0, 0, 0, 0, 0, 5 ), ( 4, 0, 0, 0, 0, 0, 3, 0 ), ( 0, 2, 0, 0, 0, 0, 0, 1 ), ( 0, 0, 0, 0, 0, 0, 0, 0 ) )		// CHECK-NEXT: ( ( 6, 0, 0, 0, 0, 0, 0, 5 ), ( 4, 0, 0, 0, 0, 0, 3, 0 ), ( 0, 2, 0, 0, 0, 0, 0, 1 ), ( 0, 0, 0, 0, 0, 0, 0, 0 ) )
// CHECK-NEXT: ( ( 2, 4, 0, 0, 0, 0, 0, 0 ), ( 0, 0, 0, 0, 0, 0, 0, 6 ), ( 0, 0, 8, 0, 10, 0, 0, 12 ), ( 14, 0, 16, 18, 0, 0, 0, 0 ) )		// CHECK-NEXT: ( ( 2, 4, 0, 0, 0, 0, 0, 0 ), ( 0, 0, 0, 0, 0, 0, 0, 6 ), ( 0, 0, 8, 0, 10, 0, 0, 12 ), ( 14, 0, 16, 18, 0, 0, 0, 0 ) )
		// CHECK-NEXT: ( ( 3, 3, 0, 0, 0, 0, 0, 0 ), ( 0, 0, 0, 0, 0, 0, 0, 3 ), ( 0, 0, 4, 0, 5, 0, 0, 6 ), ( 7, 0, 7, 7, 0, 0, 0, 0 ) )
// CHECK-NEXT: ( ( 2, 4, 0, 0, 0, 0, 0, 0 ), ( 0, 0, 0, 0, 0, 0, 0, 6 ), ( 0, 0, 8, 0, 10, 0, 0, 12 ), ( 14, 0, 16, 18, 0, 0, 0, 0 ) )		// CHECK-NEXT: ( ( 2, 4, 0, 0, 0, 0, 0, 0 ), ( 0, 0, 0, 0, 0, 0, 0, 6 ), ( 0, 0, 8, 0, 10, 0, 0, 12 ), ( 14, 0, 16, 18, 0, 0, 0, 0 ) )
// CHECK-NEXT: ( ( 8, 4, 0, 0, 0, 0, 0, 5 ), ( 4, 0, 0, 0, 0, 0, 3, 6 ), ( 0, 2, 8, 0, 10, 0, 0, 13 ), ( 14, 0, 16, 18, 0, 0, 0, 0 ) )		// CHECK-NEXT: ( ( 8, 4, 0, 0, 0, 0, 0, 5 ), ( 4, 0, 0, 0, 0, 0, 3, 6 ), ( 0, 2, 8, 0, 10, 0, 0, 13 ), ( 14, 0, 16, 18, 0, 0, 0, 0 ) )
// CHECK-NEXT: ( ( 12, 0, 0, 0, 0, 0, 0, 0 ), ( 0, 0, 0, 0, 0, 0, 0, 0 ), ( 0, 0, 0, 0, 0, 0, 0, 12 ), ( 0, 0, 0, 0, 0, 0, 0, 0 ) )		// CHECK-NEXT: ( ( 12, 0, 0, 0, 0, 0, 0, 0 ), ( 0, 0, 0, 0, 0, 0, 0, 0 ), ( 0, 0, 0, 0, 0, 0, 0, 12 ), ( 0, 0, 0, 0, 0, 0, 0, 0 ) )
		// CHECK-NEXT: ( ( 8, 0, 0, 0, 0, 0, 0, -5 ), ( -4, 0, 0, 0, 0, 0, -3, 0 ), ( 0, -2, 0, 0, 0, 0, 0, 13 ), ( 0, 0, 0, 0, 0, 0, 0, 0 ) )
//		//
call @dump(%sm1) : (tensor<?x?xf64, #DCSR>) -> ()		call @dump(%sm1) : (tensor<?x?xf64, #DCSR>) -> ()
call @dump(%sm2) : (tensor<?x?xf64, #DCSR>) -> ()		call @dump(%sm2) : (tensor<?x?xf64, #DCSR>) -> ()
call @dump(%0) : (tensor<?x?xf64, #DCSR>) -> ()		call @dump(%0) : (tensor<?x?xf64, #DCSR>) -> ()
call @dump(%1) : (tensor<?x?xf64, #DCSR>) -> ()		call @dump(%1) : (tensor<?x?xf64, #DCSR>) -> ()
call @dump(%2) : (tensor<?x?xf64, #DCSR>) -> ()		call @dump(%2) : (tensor<?x?xf64, #DCSR>) -> ()
call @dump(%3) : (tensor<?x?xf64, #DCSR>) -> ()		call @dump(%3) : (tensor<?x?xf64, #DCSR>) -> ()
		call @dump(%4) : (tensor<?x?xf64, #DCSR>) -> ()
		call @dump(%5) : (tensor<?x?xf64, #DCSR>) -> ()

// Release the resources.		// Release the resources.
sparse_tensor.release %sm1 : tensor<?x?xf64, #DCSR>		sparse_tensor.release %sm1 : tensor<?x?xf64, #DCSR>
sparse_tensor.release %sm2 : tensor<?x?xf64, #DCSR>		sparse_tensor.release %sm2 : tensor<?x?xf64, #DCSR>
sparse_tensor.release %0 : tensor<?x?xf64, #DCSR>		sparse_tensor.release %0 : tensor<?x?xf64, #DCSR>
sparse_tensor.release %2 : tensor<?x?xf64, #DCSR>		sparse_tensor.release %1 : tensor<?x?xf64, #DCSR>
sparse_tensor.release %3 : tensor<?x?xf64, #DCSR>		sparse_tensor.release %3 : tensor<?x?xf64, #DCSR>
		sparse_tensor.release %4 : tensor<?x?xf64, #DCSR>
		sparse_tensor.release %5 : tensor<?x?xf64, #DCSR>
return		return
}		}
}		}

mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_vector_ops.mlir

[... 84 unchanged lines hidden ...]	%0 = linalg.generic #trait_op
outs(%xv: tensor<?xf64, #SparseVector>) {		outs(%xv: tensor<?xf64, #SparseVector>) {
^bb(%a: f64, %b: f64, %x: f64):		^bb(%a: f64, %b: f64, %x: f64):
%1 = arith.addf %a, %b : f64		%1 = arith.addf %a, %b : f64
linalg.yield %1 : f64		linalg.yield %1 : f64
} -> tensor<?xf64, #SparseVector>		} -> tensor<?xf64, #SparseVector>
return %0 : tensor<?xf64, #SparseVector>		return %0 : tensor<?xf64, #SparseVector>
}		}

		// Creates a new sparse vector using the minimum values from two input sparse vectors.
		aartbik (inline comment): Same here. I would move these into the semiring integration tests mentioned above and simply add the novec/vec flags at the top (a test can have more than one RUN/CHECK series).
		// When there is no overlap, include the present value in the output.
		func @vector_min(%arga: tensor<?xf64, #SparseVector>,
		jim22k (author, inline comment): This test passes.
		%argb: tensor<?xf64, #SparseVector>) -> tensor<?xf64, #SparseVector> {
		%c = arith.constant 0 : index
		%d = tensor.dim %arga, %c : tensor<?xf64, #SparseVector>
		%xv = sparse_tensor.init [%d] : tensor<?xf64, #SparseVector>
		%0 = linalg.generic #trait_op
		ins(%arga, %argb: tensor<?xf64, #SparseVector>, tensor<?xf64, #SparseVector>)
		outs(%xv: tensor<?xf64, #SparseVector>) {
		^bb(%a: f64, %b: f64, %x: f64):
		%1 = sparse_tensor.binary %a, %b : f64, f64 to f64
		overlap={
		^bb0(%a0: f64, %b0: f64):
		%cmp = arith.cmpf "olt", %a0, %b0 : f64
		%2 = arith.select %cmp, %a0, %b0: f64
		sparse_tensor.yield %2 : f64
		}
		left=identity
		right=identity
		linalg.yield %1 : f64
		} -> tensor<?xf64, #SparseVector>
		return %0 : tensor<?xf64, #SparseVector>
		}

// Multiplies two sparse vectors into a new sparse vector.		// Multiplies two sparse vectors into a new sparse vector.
func @vector_mul(%arga: tensor<?xf64, #SparseVector>,		func @vector_mul(%arga: tensor<?xf64, #SparseVector>,
%argb: tensor<?xf64, #SparseVector>) -> tensor<?xf64, #SparseVector> {		%argb: tensor<?xf64, #SparseVector>) -> tensor<?xf64, #SparseVector> {
%c = arith.constant 0 : index		%c = arith.constant 0 : index
%d = tensor.dim %arga, %c : tensor<?xf64, #SparseVector>		%d = tensor.dim %arga, %c : tensor<?xf64, #SparseVector>
%xv = sparse_tensor.init [%d] : tensor<?xf64, #SparseVector>		%xv = sparse_tensor.init [%d] : tensor<?xf64, #SparseVector>
%0 = linalg.generic #trait_op		%0 = linalg.generic #trait_op
ins(%arga, %argb: tensor<?xf64, #SparseVector>, tensor<?xf64, #SparseVector>)		ins(%arga, %argb: tensor<?xf64, #SparseVector>, tensor<?xf64, #SparseVector>)
[... 19 unchanged lines hidden ...]	%0 = linalg.generic #trait_op
linalg.yield %1 : f64		linalg.yield %1 : f64
} -> tensor<?xf64, #DenseVector>		} -> tensor<?xf64, #DenseVector>
return %0 : tensor<?xf64, #DenseVector>		return %0 : tensor<?xf64, #DenseVector>
}		}

// Sum reduces dot product of two sparse vectors.		// Sum reduces dot product of two sparse vectors.
func @vector_dotprod(%arga: tensor<?xf64, #SparseVector>,		func @vector_dotprod(%arga: tensor<?xf64, #SparseVector>,
%argb: tensor<?xf64, #SparseVector>,		%argb: tensor<?xf64, #SparseVector>,
%argx: tensor<f64> {linalg.inplaceable = true}) -> tensor<f64> {		%argx: tensor<f64> {linalg.inplaceable = true}) -> tensor<f64> {
%0 = linalg.generic #trait_dot		%0 = linalg.generic #trait_dot
ins(%arga, %argb: tensor<?xf64, #SparseVector>, tensor<?xf64, #SparseVector>)		ins(%arga, %argb: tensor<?xf64, #SparseVector>, tensor<?xf64, #SparseVector>)
outs(%argx: tensor<f64>) {		outs(%argx: tensor<f64>) {
^bb(%a: f64, %b: f64, %x: f64):		^bb(%a: f64, %b: f64, %x: f64):
%1 = arith.mulf %a, %b : f64		%1 = arith.mulf %a, %b : f64
%2 = arith.addf %x, %1 : f64		%2 = arith.addf %x, %1 : f64
linalg.yield %2 : f64		linalg.yield %2 : f64
} -> tensor<f64>		} -> tensor<f64>
return %0 : tensor<f64>		return %0 : tensor<f64>
}		}

// Dumps a sparse vector.		// Take a set difference of two sparse vectors. The result will include only those
		// sparse elements present in the first, but not the second vector.
		func @vector_setdiff(%arga: tensor<?xf64, #SparseVector>,
		jim22k (author, inline comment): This test passes.
		%argb: tensor<?xf64, #SparseVector>) -> tensor<?xf64, #SparseVector> {
		%c = arith.constant 0 : index
		%d = tensor.dim %arga, %c : tensor<?xf64, #SparseVector>
		%xv = sparse_tensor.init [%d] : tensor<?xf64, #SparseVector>
		%0 = linalg.generic #trait_op
		ins(%arga, %argb: tensor<?xf64, #SparseVector>, tensor<?xf64, #SparseVector>)
		outs(%xv: tensor<?xf64, #SparseVector>) {
		^bb(%a: f64, %b: f64, %x: f64):
		%1 = sparse_tensor.binary %a, %b : f64, f64 to f64
		overlap={}
		left=identity
		right={}
		linalg.yield %1 : f64
		} -> tensor<?xf64, #SparseVector>
		return %0 : tensor<?xf64, #SparseVector>
		}
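
Note on the semantics of `@vector_setdiff`: with `overlap={}`, `left=identity`, and `right={}`, only entries stored in the first operand and absent from the second survive. A minimal C++ sketch of that rule, written as a two-pointer co-iteration over sorted index lists, is given here. `SparseVec` and `setDiff` are hypothetical names for illustration, and the sketch models the expected result only, not the code that sparsification emits.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical model: a sparse vector as (index, value) pairs sorted by index.
using SparseVec = std::vector<std::pair<int, double>>;

// Keeps entries of `a` whose index is not stored in `b` (left=identity,
// overlap and right empty), co-iterating over both sorted index lists.
SparseVec setDiff(const SparseVec &a, const SparseVec &b) {
  SparseVec out;
  std::size_t i = 0, j = 0;
  while (i < a.size()) {
    if (j == b.size() || a[i].first < b[j].first) {
      out.push_back(a[i]); // left-only: identity
      ++i;
    } else if (a[i].first == b[j].first) {
      ++i; // overlap: empty region, nothing stored
      ++j;
    } else {
      ++j; // right-only: empty region, nothing stored
    }
  }
  return out;
}
```

Using the indices and values visible in the CHECK lines for the two dumped input vectors, this model keeps the entries at indices 0, 11, 17, and 20 with values 2, 6, 8, and 10, which agrees with the expected output for `%7`.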

		// Invert the structure of a sparse vector. Present values become missing.
		// Missing values are filled with 1 (i32).
		func @vector_complement(%arga: tensor<?xf64, #SparseVector>) -> tensor<?xi32, #SparseVector> {
		jim22k (author, inline comment): This test fails with an assertion error: `Assertion failed: (!add || latGT(p0, p1)), function optimizeSet, file Merger.cpp, line 167.`
		jim22k (author, inline comment): This test is now passing.
		%c = arith.constant 0 : index
		%ci1 = arith.constant 1 : i32
		%d = tensor.dim %arga, %c : tensor<?xf64, #SparseVector>
		%xv = sparse_tensor.init [%d] : tensor<?xi32, #SparseVector>
		%0 = linalg.generic #trait_scale
		ins(%arga: tensor<?xf64, #SparseVector>)
		outs(%xv: tensor<?xi32, #SparseVector>) {
		^bb(%a: f64, %x: i32):
		%1 = sparse_tensor.unary %a : f64 to i32
		present={}
		absent={
		sparse_tensor.yield %ci1 : i32
		}
		linalg.yield %1 : i32
		} -> tensor<?xi32, #SparseVector>
		return %0 : tensor<?xi32, #SparseVector>
		}
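
Note on the semantics of `@vector_complement`: because the `absent` region yields a value for every index that is missing from the input, the kernel has to visit the whole dense index range rather than only the stored entries. A minimal C++ sketch of the expected result is given here. `complement`, `stored`, and `size` are hypothetical names for illustration and do not correspond to anything in Merger.cpp.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Hypothetical model of @vector_complement: `stored` holds the indices that
// have a value in the input; `size` is the dense length of the vector.
// present={} drops stored entries; absent yields the constant 1 elsewhere.
std::vector<std::pair<int, int>> complement(const std::set<int> &stored,
                                            int size) {
  assert(size >= 0);
  std::vector<std::pair<int, int>> out;
  for (int i = 0; i < size; ++i) {
    if (stored.count(i) == 0) out.emplace_back(i, 1);
  }
  return out;
}
```

For the 32-element input with nine stored indices (0, 3, 11, 17, 20, 21, 28, 29, 31), this model produces 23 stored ones, which agrees with the values dump expected for `%8`.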

		// Dumps a sparse vector of type f64.
func @dump(%arg0: tensor<?xf64, #SparseVector>) {		func @dump(%arg0: tensor<?xf64, #SparseVector>) {
// Dump the values array to verify only sparse contents are stored.		// Dump the values array to verify only sparse contents are stored.
%c0 = arith.constant 0 : index		%c0 = arith.constant 0 : index
%d0 = arith.constant -1.0 : f64		%d0 = arith.constant -1.0 : f64
%0 = sparse_tensor.values %arg0 : tensor<?xf64, #SparseVector> to memref<?xf64>		%0 = sparse_tensor.values %arg0 : tensor<?xf64, #SparseVector> to memref<?xf64>
%1 = vector.transfer_read %0[%c0], %d0: memref<?xf64>, vector<16xf64>		%1 = vector.transfer_read %0[%c0], %d0: memref<?xf64>, vector<16xf64>
vector.print %1 : vector<16xf64>		vector.print %1 : vector<16xf64>
// Dump the dense vector to verify structure is correct.		// Dump the dense vector to verify structure is correct.
%dv = sparse_tensor.convert %arg0 : tensor<?xf64, #SparseVector> to tensor<?xf64>		%dv = sparse_tensor.convert %arg0 : tensor<?xf64, #SparseVector> to tensor<?xf64>
%2 = bufferization.to_memref %dv : memref<?xf64>		%2 = bufferization.to_memref %dv : memref<?xf64>
%3 = vector.transfer_read %2[%c0], %d0: memref<?xf64>, vector<32xf64>		%3 = vector.transfer_read %2[%c0], %d0: memref<?xf64>, vector<32xf64>
vector.print %3 : vector<32xf64>		vector.print %3 : vector<32xf64>
memref.dealloc %2 : memref<?xf64>		memref.dealloc %2 : memref<?xf64>
return		return
}		}

		// Dumps a sparse vector of type i32.
		func @dumpi32(%arg0: tensor<?xi32, #SparseVector>) {
		// Dump the values array to verify only sparse contents are stored.
		%c0 = arith.constant 0 : index
		%d0 = arith.constant -1 : i32
		%0 = sparse_tensor.values %arg0 : tensor<?xi32, #SparseVector> to memref<?xi32>
		%1 = vector.transfer_read %0[%c0], %d0: memref<?xi32>, vector<24xi32>
		vector.print %1 : vector<24xi32>
		// Dump the dense vector to verify structure is correct.
		%dv = sparse_tensor.convert %arg0 : tensor<?xi32, #SparseVector> to tensor<?xi32>
		%2 = bufferization.to_memref %dv : memref<?xi32>
		%3 = vector.transfer_read %2[%c0], %d0: memref<?xi32>, vector<32xi32>
		vector.print %3 : vector<32xi32>
		memref.dealloc %2 : memref<?xi32>
		return
		}

// Driver method to call and verify vector kernels.		// Driver method to call and verify vector kernels.
func @entry() {		func @entry() {
%c0 = arith.constant 0 : index		%c0 = arith.constant 0 : index
%d1 = arith.constant 1.1 : f64		%d1 = arith.constant 1.1 : f64

// Setup sparse vectors.		// Setup sparse vectors.
%v1 = arith.constant sparse<		%v1 = arith.constant sparse<
[ [0], [3], [11], [17], [20], [21], [28], [29], [31] ],		[ [0], [3], [11], [17], [20], [21], [28], [29], [31] ],
[... 14 unchanged lines hidden ...]	func @entry() {
// Call sparse vector kernels.		// Call sparse vector kernels.
%0 = call @vector_scale(%sv1)		%0 = call @vector_scale(%sv1)
: (tensor<?xf64, #SparseVector>) -> tensor<?xf64, #SparseVector>		: (tensor<?xf64, #SparseVector>) -> tensor<?xf64, #SparseVector>
%1 = call @vector_scale_inplace(%sv1)		%1 = call @vector_scale_inplace(%sv1)
: (tensor<?xf64, #SparseVector>) -> tensor<?xf64, #SparseVector>		: (tensor<?xf64, #SparseVector>) -> tensor<?xf64, #SparseVector>
%2 = call @vector_add(%sv1, %sv2)		%2 = call @vector_add(%sv1, %sv2)
: (tensor<?xf64, #SparseVector>,		: (tensor<?xf64, #SparseVector>,
tensor<?xf64, #SparseVector>) -> tensor<?xf64, #SparseVector>		tensor<?xf64, #SparseVector>) -> tensor<?xf64, #SparseVector>
%3 = call @vector_mul(%sv1, %sv2)		%3 = call @vector_min(%sv1, %sv2)
		: (tensor<?xf64, #SparseVector>,
		tensor<?xf64, #SparseVector>) -> tensor<?xf64, #SparseVector>
		%4 = call @vector_mul(%sv1, %sv2)
: (tensor<?xf64, #SparseVector>,		: (tensor<?xf64, #SparseVector>,
tensor<?xf64, #SparseVector>) -> tensor<?xf64, #SparseVector>		tensor<?xf64, #SparseVector>) -> tensor<?xf64, #SparseVector>
%4 = call @vector_mul_d(%sv1, %sv2)		%5 = call @vector_mul_d(%sv1, %sv2)
: (tensor<?xf64, #SparseVector>,		: (tensor<?xf64, #SparseVector>,
tensor<?xf64, #SparseVector>) -> tensor<?xf64, #DenseVector>		tensor<?xf64, #SparseVector>) -> tensor<?xf64, #DenseVector>
%5 = call @vector_dotprod(%sv1, %sv2, %x)		%6 = call @vector_dotprod(%sv1, %sv2, %x)
: (tensor<?xf64, #SparseVector>,		: (tensor<?xf64, #SparseVector>,
tensor<?xf64, #SparseVector>, tensor<f64>) -> tensor<f64>		tensor<?xf64, #SparseVector>, tensor<f64>) -> tensor<f64>
		%7 = call @vector_setdiff(%sv1, %sv2)
		: (tensor<?xf64, #SparseVector>,
		tensor<?xf64, #SparseVector>) -> tensor<?xf64, #SparseVector>
		%8 = call @vector_complement(%sv1)
		: (tensor<?xf64, #SparseVector>) -> tensor<?xi32, #SparseVector>

//		//
// Verify the results.		// Verify the results.
//		//
// CHECK: ( 2, 4, 6, 8, 10, 12, 14, 16, 18, -1, -1, -1, -1, -1, -1, -1 )		// CHECK: ( 2, 4, 6, 8, 10, 12, 14, 16, 18, -1, -1, -1, -1, -1, -1, -1 )
// CHECK-NEXT: ( 2, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 8, 0, 0, 10, 12, 0, 0, 0, 0, 0, 0, 14, 16, 0, 18 )		// CHECK-NEXT: ( 2, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 8, 0, 0, 10, 12, 0, 0, 0, 0, 0, 0, 14, 16, 0, 18 )
// CHECK-NEXT: ( 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, -1, -1, -1, -1, -1, -1 )		// CHECK-NEXT: ( 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, -1, -1, -1, -1, -1, -1 )
// CHECK-NEXT: ( 0, 11, 0, 12, 13, 0, 0, 0, 0, 0, 14, 0, 0, 0, 0, 0, 15, 0, 16, 0, 0, 17, 0, 0, 0, 0, 0, 0, 18, 19, 0, 20 )		// CHECK-NEXT: ( 0, 11, 0, 12, 13, 0, 0, 0, 0, 0, 14, 0, 0, 0, 0, 0, 15, 0, 16, 0, 0, 17, 0, 0, 0, 0, 0, 0, 18, 19, 0, 20 )
// CHECK-NEXT: ( 2, 4, 6, 8, 10, 12, 14, 16, 18, -1, -1, -1, -1, -1, -1, -1 )		// CHECK-NEXT: ( 2, 4, 6, 8, 10, 12, 14, 16, 18, -1, -1, -1, -1, -1, -1, -1 )
// CHECK-NEXT: ( 2, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 8, 0, 0, 10, 12, 0, 0, 0, 0, 0, 0, 14, 16, 0, 18 )		// CHECK-NEXT: ( 2, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 8, 0, 0, 10, 12, 0, 0, 0, 0, 0, 0, 14, 16, 0, 18 )
// CHECK-NEXT: ( 2, 4, 6, 8, 10, 12, 14, 16, 18, -1, -1, -1, -1, -1, -1, -1 )		// CHECK-NEXT: ( 2, 4, 6, 8, 10, 12, 14, 16, 18, -1, -1, -1, -1, -1, -1, -1 )
// CHECK-NEXT: ( 2, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 8, 0, 0, 10, 12, 0, 0, 0, 0, 0, 0, 14, 16, 0, 18 )		// CHECK-NEXT: ( 2, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 8, 0, 0, 10, 12, 0, 0, 0, 0, 0, 0, 14, 16, 0, 18 )
// CHECK-NEXT: ( 2, 11, 16, 13, 14, 6, 15, 8, 16, 10, 29, 32, 35, 38, -1, -1 )		// CHECK-NEXT: ( 2, 11, 16, 13, 14, 6, 15, 8, 16, 10, 29, 32, 35, 38, -1, -1 )
// CHECK-NEXT: ( 2, 11, 0, 16, 13, 0, 0, 0, 0, 0, 14, 6, 0, 0, 0, 0, 15, 8, 16, 0, 10, 29, 0, 0, 0, 0, 0, 0, 32, 35, 0, 38 )		// CHECK-NEXT: ( 2, 11, 0, 16, 13, 0, 0, 0, 0, 0, 14, 6, 0, 0, 0, 0, 15, 8, 16, 0, 10, 29, 0, 0, 0, 0, 0, 0, 32, 35, 0, 38 )
		// CHECK-NEXT: ( 2, 11, 4, 13, 14, 6, 15, 8, 16, 10, 12, 14, 16, 18, -1, -1 )
		// CHECK-NEXT: ( 2, 11, 0, 4, 13, 0, 0, 0, 0, 0, 14, 6, 0, 0, 0, 0, 15, 8, 16, 0, 10, 12, 0, 0, 0, 0, 0, 0, 14, 16, 0, 18 )
// CHECK-NEXT: ( 48, 204, 252, 304, 360, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 )		// CHECK-NEXT: ( 48, 204, 252, 304, 360, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 )
// CHECK-NEXT: ( 0, 0, 0, 48, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 204, 0, 0, 0, 0, 0, 0, 252, 304, 0, 360 )		// CHECK-NEXT: ( 0, 0, 0, 48, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 204, 0, 0, 0, 0, 0, 0, 252, 304, 0, 360 )
// CHECK-NEXT: ( 0, 0, 0, 48, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 204, 0, 0, 0, 0, 0, 0, 252, 304, 0, 360 )		// CHECK-NEXT: ( 0, 0, 0, 48, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 204, 0, 0, 0, 0, 0, 0, 252, 304, 0, 360 )
// CHECK-NEXT: 1169.1		// CHECK-NEXT: 1169.1
		// CHECK-NEXT: ( 2, 6, 8, 10, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 )
		// CHECK-NEXT: ( 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 8, 0, 0, 10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 )
		// CHECK-NEXT: ( 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, -1 )
		// CHECK-NEXT: ( 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0 )
//		//
call @dump(%sv1) : (tensor<?xf64, #SparseVector>) -> ()		call @dump(%sv1) : (tensor<?xf64, #SparseVector>) -> ()
call @dump(%sv2) : (tensor<?xf64, #SparseVector>) -> ()		call @dump(%sv2) : (tensor<?xf64, #SparseVector>) -> ()
call @dump(%0) : (tensor<?xf64, #SparseVector>) -> ()		call @dump(%0) : (tensor<?xf64, #SparseVector>) -> ()
call @dump(%1) : (tensor<?xf64, #SparseVector>) -> ()		call @dump(%1) : (tensor<?xf64, #SparseVector>) -> ()
call @dump(%2) : (tensor<?xf64, #SparseVector>) -> ()		call @dump(%2) : (tensor<?xf64, #SparseVector>) -> ()
call @dump(%3) : (tensor<?xf64, #SparseVector>) -> ()		call @dump(%3) : (tensor<?xf64, #SparseVector>) -> ()
%m4 = sparse_tensor.values %4 : tensor<?xf64, #DenseVector> to memref<?xf64>		call @dump(%4) : (tensor<?xf64, #SparseVector>) -> ()
%v4 = vector.load %m4[%c0]: memref<?xf64>, vector<32xf64>		%m5 = sparse_tensor.values %5 : tensor<?xf64, #DenseVector> to memref<?xf64>
vector.print %v4 : vector<32xf64>		%v5 = vector.load %m5[%c0]: memref<?xf64>, vector<32xf64>
%m5 = bufferization.to_memref %5 : memref<f64>		vector.print %v5 : vector<32xf64>
%v5 = memref.load %m5[] : memref<f64>		%m6 = bufferization.to_memref %6 : memref<f64>
vector.print %v5 : f64		%v6 = memref.load %m6[] : memref<f64>
		vector.print %v6 : f64
		call @dump(%7) : (tensor<?xf64, #SparseVector>) -> ()
		call @dumpi32(%8) : (tensor<?xi32, #SparseVector>) -> ()

// Release the resources.		// Release the resources.
sparse_tensor.release %sv1 : tensor<?xf64, #SparseVector>		sparse_tensor.release %sv1 : tensor<?xf64, #SparseVector>
sparse_tensor.release %sv2 : tensor<?xf64, #SparseVector>		sparse_tensor.release %sv2 : tensor<?xf64, #SparseVector>
sparse_tensor.release %0 : tensor<?xf64, #SparseVector>		sparse_tensor.release %0 : tensor<?xf64, #SparseVector>
sparse_tensor.release %2 : tensor<?xf64, #SparseVector>		sparse_tensor.release %2 : tensor<?xf64, #SparseVector>
sparse_tensor.release %3 : tensor<?xf64, #SparseVector>		sparse_tensor.release %3 : tensor<?xf64, #SparseVector>
sparse_tensor.release %4 : tensor<?xf64, #DenseVector>		sparse_tensor.release %4 : tensor<?xf64, #SparseVector>
		sparse_tensor.release %5 : tensor<?xf64, #DenseVector>
		sparse_tensor.release %7 : tensor<?xf64, #SparseVector>
		sparse_tensor.release %8 : tensor<?xi32, #SparseVector>
memref.dealloc %xdata : memref<f64>		memref.dealloc %xdata : memref<f64>
return		return
}		}
}		}

mlir/unittests/Dialect/SparseTensor/MergerTest.cpp

	[... 217 unchanged lines hidden ...]
	/// lat( i_00 i_01 / (tensor_0 + tensor_1) )			/// lat( i_00 i_01 / (tensor_0 + tensor_1) )
	/// lat( i_00 / tensor_0 )			/// lat( i_00 / tensor_0 )
	/// }			/// }
	TEST_F(MergerTest3T1L, VectorAdd2) {			TEST_F(MergerTest3T1L, VectorAdd2) {
	// Construct expression.			// Construct expression.
	auto e = addf(tensor(t0), tensor(t1));			auto e = addf(tensor(t0), tensor(t1));

	// Build lattices and check.			// Build lattices and check.
	auto s = merger.buildLattices(e, l0);			auto s = merger.buildLattices(e, l0, 1);
	expectNumLatPoints(s, 3);			expectNumLatPoints(s, 3);
	expectLatPoint(s, lat(0), addfPattern(tensorPattern(t0), tensorPattern(t1)),			expectLatPoint(s, lat(0), addfPattern(tensorPattern(t0), tensorPattern(t1)),
	loopsToBits({{l0, t0}, {l0, t1}}));			loopsToBits({{l0, t0}, {l0, t1}}));
	expectLatPointWithinRange(s, lat(1), 2, tensorPattern(t0),			expectLatPointWithinRange(s, lat(1), 2, tensorPattern(t0),
	loopsToBits({{l0, t0}}));			loopsToBits({{l0, t0}}));
	expectLatPointWithinRange(s, lat(1), 2, tensorPattern(t1),			expectLatPointWithinRange(s, lat(1), 2, tensorPattern(t1),
	loopsToBits({{l0, t1}}));			loopsToBits({{l0, t1}}));

	[... 14 unchanged lines hidden ...]
	/// {			/// {
	/// lat( i_00 i_01 / (tensor_0 * tensor_1) )			/// lat( i_00 i_01 / (tensor_0 * tensor_1) )
	/// }			/// }
	TEST_F(MergerTest3T1L, VectorMul2) {			TEST_F(MergerTest3T1L, VectorMul2) {
	// Construct expression.			// Construct expression.
	auto e = mulf(t0, t1);			auto e = mulf(t0, t1);

	// Build lattices and check.			// Build lattices and check.
	auto s = merger.buildLattices(e, l0);			auto s = merger.buildLattices(e, l0, 1);
	expectNumLatPoints(s, 1);			expectNumLatPoints(s, 1);
	expectLatPoint(s, lat(0), mulfPattern(tensorPattern(t0), tensorPattern(t1)),			expectLatPoint(s, lat(0), mulfPattern(tensorPattern(t0), tensorPattern(t1)),
	loopsToBits({{l0, t0}, {l0, t1}}));			loopsToBits({{l0, t0}, {l0, t1}}));

	// Optimize lattices and check.			// Optimize lattices and check.
	s = merger.optimizeSet(s);			s = merger.optimizeSet(s);
	expectNumLatPoints(s, 1);			expectNumLatPoints(s, 1);
	expectLatPoint(s, lat(0), mulfPattern(tensorPattern(t0), tensorPattern(t1)),			expectLatPoint(s, lat(0), mulfPattern(tensorPattern(t0), tensorPattern(t1)),
	loopsToBits({{l0, t0}, {l0, t1}}));			loopsToBits({{l0, t0}, {l0, t1}}));
	}			}
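
Note on the lattice point counts checked above: `VectorAdd2` expects three lattice points before optimization and `VectorMul2` expects one. A minimal C++ sketch of why a disjunction yields more points than a conjunction is given here. It assumes a simplified model in which a lattice point is just a bitmask of the tensors that must be present; `LatSet`, `conj`, and `disj` are hypothetical names and are not the Merger API.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical model: a lattice point is a bitmask of the tensors that must
// be present at the current loop index (bit 0 = tensor_0, bit 1 = tensor_1).
using LatSet = std::set<std::uint32_t>;

// Conjunction: every pairing of points, with the condition bits merged.
LatSet conj(const LatSet &s0, const LatSet &s1) {
  LatSet out;
  for (auto p0 : s0)
    for (auto p1 : s1)
      out.insert(p0 | p1);
  return out;
}

// Disjunction: the conjunction plus each operand's points on their own.
LatSet disj(const LatSet &s0, const LatSet &s1) {
  LatSet out = conj(s0, s1);
  out.insert(s0.begin(), s0.end());
  out.insert(s1.begin(), s1.end());
  return out;
}
```

In this model, `disj({1}, {2})` contains three points (both tensors, only tensor_0, only tensor_1) and `conj({1}, {2})` contains one (both tensors), matching the counts passed to `expectNumLatPoints` in the two tests.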

[mlir][sparse] Lowering for unary and binary (Closed, Public)

Diff 423617

mlir/include/mlir/Dialect/SparseTensor/Utils/Merger.h

mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp

mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp

mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_matrix_ops.mlir

mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_vector_ops.mlir

mlir/unittests/Dialect/SparseTensor/MergerTest.cpp