This is an archive of the discontinued LLVM Phabricator instance.

[ADT] Stop trying to test every combination of values in a triple in every permutation.
ClosedPublic

Authored by chandlerc on Aug 4 2016, 1:23 AM.

Details

Summary

I know we've talked about this before, and I'm pretty sure I've proposed
something similar before but I think at the time I had a less good
understanding of what was actually under test.

But fundamentally, the current approach isn't a long-term viable pattern. This
test keeps getting slower, despite my working very hard to make it get some
"optimizations" even in -O0 builds in order to lower the constant factors.
Fundamentally, we're doing an unreasonable amount of work. And its crept back
into my critical path, so I'm returning to the idea of changing the testing
strategy here.

Looking at the specific thing being tested -- the goal seems very
clearly to be testing the *permutations*, not the *combinations*. The
combinations are driving up the complexity much more than anything else.

Instead, test every possible value for a given triple entry in every
permutation of *some* triple. This really seems to cover the core goal
of the test. Every single possible triple component is tested in every
position. But because we keep the rest of the triple constant, it does
so in a dramatically more scalable amount of time.

For me on a debug build, this goes from running for 19 seconds to 19
milliseconds, or a 1000x improvement. This makes a world of difference
for the critical path of 'ninja check-llvm' and other extremely common
workflows.

Diff Detail

Repository
rL LLVM

Event Timeline

chandlerc updated this revision to Diff 66771.Aug 4 2016, 1:23 AM
chandlerc retitled this revision from to [ADT] Stop trying to test every combination of values in a triple in every permutation..
chandlerc updated this object.
chandlerc added a subscriber: llvm-commits.
dberris added a subscriber: dberris.Aug 4 2016, 1:29 AM

Do you need to use permutations?

So the original version tried every value in every slot against every other value in every slot.
Given, say, 4 slots each with M, N, O, and P possible values, that gives us M * N * O * P test cases.

  • but we can reasonably identify that the parsing of the value in a given slot is independent of the values in the other slots, so this now tests every value in every slot regardless of the values in other slots?
    • that gets us down to (M + N + O + P) * 4 tests (assuming 4 slots for simplicity)
  • would it be reasonable to assume that the parsing of a value is independent of which slot it is in? So we shouldn't then test that every value can appear in every slot/ordering?
    • that would get us down to M + N + O + P test cases

Essentially we want to test that the different entities can appear in any order (one test for each ordering, or 4!) then that each value for each entity can roundtrip (M + N + O + P), I think?

Do you need to use permutations?

So the original version tried every value in every slot against every other value in every slot.

More than that, see below:

Given, say, 4 slots each with M, N, O, and P possible values, that gives us M * N * O * P test cases.

So, the original version, ignoring the 3-slot special case and just focusing on the 4 slot case did:

(M * N * O * P) * 4!

  • but we can reasonably identify that the parsing of the value in a given slot is independent of the values in the other slots, so this now tests every value in every slot regardless of the values in other slots?
    • that gets us down to (M + N + O + P) * 4 tests (assuming 4 slots for simplicity)

Correct in general. IT gets us down to: (M + N + O + P) * 4! tests.

  • would it be reasonable to assume that the parsing of a value is independent of which slot it is in? So we shouldn't then test that every value can appear in every slot/ordering?

It might, but that just saves 4! from this and wouldn't catch stateful bugs in the parsing. So I'm pretty happy to try and parse each entry in each position.

Mostly this is because the runtime is now so fast that it seems to not matter.

Does that make sense?

  • would it be reasonable to assume that the parsing of a value is independent of which slot it is in? So we shouldn't then test that every value can appear in every slot/ordering?

It might, but that just saves 4! from this and wouldn't catch stateful bugs in the parsing. So I'm pretty happy to try and parse each entry in each position.

Mostly this is because the runtime is now so fast that it seems to not matter.

Does that make sense?

I think that makes sense, although catching state-related bugs might be possible through a different test -- i.e. specifically testing boundary conditions for when a buggy implementation might create a wrongly-normalised triple.

Although I do agree that this is better than the current state.

Mind updating the description with some of the computations, to make it clearer what kinds of savings (in terms of test cases) and what kinds of trade-offs are being made? More specifically I suggest re-wording the parts about combinations and permutations, and using the formulae instead (even if they're hand-wavy and approximations).

unittests/ADT/TripleTest.cpp
306–310 ↗(On Diff #66771)

I think in-lining the computation in the reduction of the number of tests here would be useful (instead of just having it in the description). For example, it's useful to know that if we do the Cartesian Product and permute each of the tuples in that space, we get the following number of test-cases:

X = (|Arch| * |Vendor| * |OS| * |Env|) * 4!

This patch cuts it down to:

Y = (|Arch| + |Vendor| + |OS| + |Env|) * 4! * 4

That's the difference between a geometric series and an arithmetic series -- i.e. if any one of the sets grows by N, that X grows exponentially compared to linearly to Y.

dberris added inline comments.Aug 4 2016, 6:10 PM
unittests/ADT/TripleTest.cpp
306–310 ↗(On Diff #66771)

That's the difference between a geometric series and an arithmetic series -- i.e. if any one of the sets grows by N, that X grows exponentially compared to linearly to Y.

Actually, scratch this part (I didn't know why I was thinking about it as a geometric series). It's the difference between N^4 and N where N is the cardinality of the largest set.

rengolin edited edge metadata.EditedAug 5 2016, 6:51 AM

I'm still looking at the code and the comments, but wouldn't it be enough for 99% of the cases to test it against "unknown" on all other fields? I mean, there are a few problematic combinations, but they require specialised knowledge, and neither the old test (nor the new one) are checking them.

So, I'd expect something like:

for (Arch) {
  E = join (Arch, "unknown", "unknown", "unknown");
  do {
    EXPECT_EQ(E, Triple::normalize(Join(C[I[0]], C[I[1]], C[I[2]])));
  } while (std::next_permutation(std::begin(I), std::end(I)));
}
for (Vendor) {
  E = join ("unknown", Vendor, "unknown", "unknown");
  ...

to catch most silly errors. If none of those are caught, than we move into more advanced testing, via specialised knowledge, which we already do below, ex:

EXPECT_EQ("i386--windows-gnu", Triple::normalize("i386-mingw32"));

But maybe also some classes of problems, which would require some more permutation analysis, but only a fraction of the original problem.

I'm still looking at the code and the comments, but wouldn't it be enough for 99% of the cases to test it against "unknown" on all other fields? I mean, there are a few problematic combinations, but they require specialised knowledge, and neither the old test (nor the new one) are checking them.

I think that's exactly what the patch does?

I'm still looking at the code and the comments, but wouldn't it be enough for 99% of the cases to test it against "unknown" on all other fields? I mean, there are a few problematic combinations, but they require specialised knowledge, and neither the old test (nor the new one) are checking them.

I think that's exactly what the patch does?

Dang! That's what I get for commenting before really reading the patch... :)

rengolin accepted this revision.Aug 5 2016, 8:45 AM
rengolin edited edge metadata.

Gosh, now that I looked at it, this is *exactly* what it does.

Some people talking about (A+V+O+E)*4! got me side tracked. Apologies.

LGTM! :)

This revision is now accepted and ready to land.Aug 5 2016, 8:45 AM
chandlerc marked 2 inline comments as done.Aug 5 2016, 11:06 PM

Improved comments and description. Thanks all, landing!

With this, the slowest test in check-llvm is 20s and the next slowest is 11s!

This revision was automatically updated to reflect the committed changes.