We opt to use unit tests rather than check tests as the lattice/merger code is a small C++ component with a well-defined API. Testing this API via check tests would be far less direct and readable. In addition, as the check tests will only be able to test the API indirectly, the tests may break based on unrelated changes; e.g. changes in linalg.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
mlir/unittests/Dialect/SparseTensor/MergerTest.cpp | ||
---|---|---|
10–11 | Links like that may go stale in the (far?) future. I think it would be more illustrative to just show the "tensor index expression" in human readable format in the comments | |
26 | this is a good start, but we probably also want to make sure the lattices have the expected structure |
mlir/unittests/Dialect/SparseTensor/MergerTest.cpp | ||
---|---|---|
25 | how about a before opt after opt check as well? |
Can you expand in the commit description why this is more suitable as a unit-tests and can't be tested on IR change (folding on simple sparse operations for example that would exercises all theses cases?).
I'll take this, since I was suggesting this to Gus as part of the refactoring :-)
To me, unit testing classes with a well-defined API for a large variety of expected input/output scenarios (and the lattice code falls into that category) seems much more preferable over setting up elaborate IR in an attempt to test those scenarios?
But, made curious by by your feedback, reading our testing guide, I indeed see
Unit tests are written using Google Test and are located in the unittests/ directory. Tests of these form should be limited to API tests that cannot be reasonably written as Check tests, e.g. those for data structures. It is important to keep in mind that the C++ APIs are not stable, and evolve over time. As such, directly testing the C++ IR interfaces makes the tests more fragile as those C++ APIs evolve over time. This makes future API refactorings, which may happen frequently, much more cumbersome as the number of tests scale.
I suppose this is the first time I categorically disagree with the llvm guidelines ;-)
But, even despite my objections, in this case, I suppose we could make a good case for unit testing, since testing through checks would require testing the generated code rather than expected lattices (so hunting for while-loops and for-loops of a particular form, and with a particular body) rather than simply checking for expected lattices with tensor subexpressions. I will work with Gus on putting a rationale in the commit message.
mlir/unittests/Dialect/SparseTensor/MergerTest.cpp | ||
---|---|---|
33 | isn't this a bit too strict? isn't it possible to have two nodes for the same tensor? so tensorExp0.e0 == tensorExpr1.e0 seems a better way of comparing two tensor exprs | |
92–107 | This seems a very elaborate way of hunting for a particular subexpression Let's brainstorm offline a bit on how to make these tests more concise |
mlir/unittests/Dialect/SparseTensor/MergerTest.cpp | ||
---|---|---|
74 | It seems all tests will consist of building a tensor expression, followed by testing if the computed lattices are as expected. I suggest to use a test fixture (subclass ::testing::Test and put all shared logic to setup and test all unit tests there, then use TEST_F instead of TEST). Also, I would like to make the tests readable at quick glance, so something like EXPECT_THAT(lattice(0), "tensor_0 + tensor_1") a quick and dirty way would be string based, so either (1) add a method to the fixture that compares while parsing the expected string, or or, we would add some intuitive methods to the fixture to make the expected more readable, EXPECT_THAT(lattice(0), addf(tensor(0), tensor(1)) The short methods have all the ugly details hidden inside the fixture (use of merger), but give very concise testing Then finally, we need a way to make it less sensitive to a particular order, but while still making sure the required order is preserved. Lastly, let's start with unit tests for simple cases, e.g. 1 or 2 tensors only, and then later add the more elaborate ones. |
Rebase
mlir/unittests/Dialect/SparseTensor/MergerTest.cpp | ||
---|---|---|
33 | Yes, I wanted to ask you about this! I didn't fully understand this when I originally saw it. Do some tensor expressions point to themselves? That is, let's say expression with id 0 has Kind kTensor; will this tensor have e0 = 0? | |
92–107 | Re: (1), I agree, I would like a way to make it more readable! Re: (2), I didn't realize order was important, my mistake. That should simplify things considerably! |
mlir/unittests/Dialect/SparseTensor/MergerTest.cpp | ||
---|---|---|
33 | Ah, I see the confusion. Note that perhaps we should make this a union. For binary ops, the e0/e1 are indices to other subexpressions. |
- Simplify tests (test 2-arg add and mul, rather than 4-arg)
- Use a test fixture, move logic into test fixture class
- Make tests a bit more readable (though still needs work)
- The main change here is that we now use helper functions to check that lattice point expressions and bits equal what we expect them to equal. Those helpers are designed to take a more readable set of arguments, e.g. a prefix-notation expression to test against, in the case of the expression checker.
- Check for a specific ordering of lattice points (this may break in the future!)
- If we were really thorough, we'd check for a partial ordering on the lattice points. However, we just check for an absolute ordering.
mlir/unittests/Dialect/SparseTensor/MergerTest.cpp | ||
---|---|---|
33 | Fixed by the union refactor | |
74 | I think I have addressed all points except this one:
This I could imagine doing by building a function which, given two lattice points, checks for the given ordering between the lattice points. We'll need an ergonomic way to "find" a lattice point, e.g. by its expression. Once we "find" all lattice points, we can then do pairwise checking between the points, to make sure they are partially ordered correctly. | |
92–107 | Removed this function. See comment above re: checking partial ordering. |
mlir/unittests/Dialect/SparseTensor/CMakeLists.txt | ||
---|---|---|
2 | I believe you also have to cd .. and add add_subdirectory(SparseTensor) to the CMakeList.txt at the one-higher level. | |
mlir/unittests/Dialect/SparseTensor/MergerTest.cpp | ||
15 | How about having a generic base fixture class MergerTest that takes these two parameters in the constructor, but specific fixture classes for #tensors and #loops, just so that we can later add tests for other situations by just instantiating a new case with a a few lines of code, e.g. MergerTest48 : MergerTest(4,8), MergerTest11 : MergerTest(1,1) etc. with some tensor setup code inside the constructor (perhaps better name than this but you get the idea)? | |
79–80 | I should have put this in the getter comments, but in general holding on to the const ref for too long can have very subtle bugs. Here it is okay since you don't insert anything but using merger.exp(e0).children.e0 inline is preferred in general | |
81–83 | no braces {} for single line if | |
91–96 | replace by just default: ? | |
126 | I find these comments a bit distracting. so that this just reads as lat(0) instead of /*lat point*/ 0 Also, this makes the test a bit implementation specific (ie. a complete order is tested, but we should really just expect a partial order on the output). Do you have any good ideas on how to deal with that? |
mlir/unittests/Dialect/SparseTensor/MergerTest.cpp | ||
---|---|---|
30 | Wait, by the way this is used below, should this not add a full tensor node? Also, there is something impure about using the same merger for testing if an expression exists. |
Address all of Aart's comments except for partial ordering
mlir/unittests/Dialect/SparseTensor/MergerTest.cpp | ||
---|---|---|
15 | Let me know if the way I did this makes sense! | |
30 |
You're totally right -- fixed now. This came from me misunderstanding how the expressions worked in the Merger (back when I thought tensornum == expression id)
Let me know if the new pattern approach works for you. | |
79–80 | Alright, I'll leave it if it's okay here, but I'll keep that in mind for the future! | |
79–80 | Good to know -- I'll leave this here if it's okay, but I'll keep this in mind for the future. | |
91–96 | For some reason I've always been very paranoid about bugs arising from adding new enum variants and having those variants be handled by a "default", rather than throwing a compile-time error. I'm happy to use default here if you don't think it'll be an issue! | |
126 | Made it a bit more readable I think, removed the comments. Let me know. Working on partial ordering now. |
almost there!
mlir/unittests/Dialect/SparseTensor/MergerTest.cpp | ||
---|---|---|
46–50 | I would save the vertical space here, and just put the comments "against" the constructor, even without the leading empty /// | |
203 | actually, one last subtle point, note that since we have a(i) = b(i) + c(i) we really have *three* tensors in this example, with the third one (a), unused but dense This will matter when we start testing stuff like a(i) = a(i) + ..... and also avoids dumping tensor_out for input tensors during debugging. So please add one etra tensor in the constructor, and mark output as dense |
mlir/unittests/Dialect/SparseTensor/MergerTest.cpp | ||
---|---|---|
203 | yes, we optimized a(i) = a(i) away before, now fixed ;-) |
mlir/unittests/Dialect/SparseTensor/MergerTest.cpp | ||
---|---|---|
12–15 | one last nit, can we do without this, just to keep boilerplate down? |
mlir/unittests/Dialect/SparseTensor/MergerTest.cpp | ||
---|---|---|
12–15 | Done! |
Hey, sorry I had to revert to unbreak gcc 5 builds: https://buildkite.com/mlir/mlir-core/builds/15063#d35a9e3f-da90-4e84-86be-c69cb67b2b5c
I believe you also have to cd .. and add
add_subdirectory(SparseTensor)
to the CMakeList.txt at the one-higher level.