This is an archive of the discontinued LLVM Phabricator instance.

[llvm] [lit] Support forcing lexical test order
ClosedPublic

Authored by mgorny on Aug 7 2021, 7:42 AM.

Details

Summary

Add a new --order option to choose between available test orders:
the default "smart" order, predictable "lexical" order or "random"
order. Default to using lexical order and one job in the lit test
suite.

Diff Detail

Event Timeline

mgorny created this revision.Aug 7 2021, 7:42 AM
mgorny requested review of this revision.Aug 7 2021, 7:42 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 7 2021, 7:42 AM

Thanks for working on this!

The patch doesn't spell it consistently: sometimes --lexical and sometimes --lexical-order.

Shouldn't we put --lexical-order directly into lit's test suite's %{lit} substitution? That way, if another of lit's current or future tests is susceptible to this problem, no one has to waste time trying to reproduce the race and figure out what's happening. Blindly using %{lit} in lit's test suite would just work.

As far as I know, reorder.py is the only test in lit's test suite that requires the reordering. Either it can use a %{lit-reordered} substitution that drops --lexical-order, or it can use a new command-line option that selects the default order and overrides any previous --lexical-order or --shuffle.

llvm/utils/lit/tests/shtest-shell.py
10–11

Any reason to keep this? It might be worthwhile to grep for lit_test_times, if you didn't already.

mgorny added a comment.Aug 7 2021, 9:25 AM

Thanks for working on this!

The patch doesn't spell it consistently: sometimes --lexical and sometimes --lexical-order.

Sorry, I've forgotten how I named it :-D.

Shouldn't we put --lexical-order directly into lit's test suite's %{lit} substitution? That way, if another of lit's current or future tests is susceptible to this problem, no one has to waste time trying to reproduce the race and figure out what's happening. Blindly using %{lit} in lit's test suite would just work.

I don't mind that. However, should we also put -j 1 then?

As far as I know, reorder.py is the only test in lit's test suite that requires the reordering. Either it can use a %{lit-reordered} substitution that drops --lexical-order, or it can use a new command-line option that selects the default order and overrides any previous --lexical-order or --shuffle.

I suppose I could do that, and then replace the mutually exclusive group approach with 'last option wins'.

mgorny marked an inline comment as done.Aug 7 2021, 12:34 PM
mgorny added inline comments.
llvm/utils/lit/tests/shtest-shell.py
10–11

I was grepping for order FIXMEs ;-). Fixed now.

mgorny updated this revision to Diff 364976.Aug 7 2021, 12:41 PM
mgorny marked an inline comment as done.
mgorny edited the summary of this revision. (Show Details)

Add --default-order. Last option specified takes precedence. Include --lexical-order in %{lit} substitution.

Thanks for continuing to work on all this.

Shouldn't we put --lexical-order directly into lit's test suite's %{lit} substitution? That way, if another of lit's current or future tests is susceptible to this problem, no one has to waste time trying to reproduce the race and figure out what's happening. Blindly using %{lit} in lit's test suite would just work.

I don't mind that. However, should we also put -j 1 then?

Good point. I haven't looked to see whether the default value of -j is ever expected in lit's test suite. Anyway, maybe that should be a different patch to make reviewing/debugging easier.

Also, the docs should be updated: https://llvm.org/docs/CommandGuide/lit.html

Please don't forget this.

llvm/utils/lit/lit/cl_arguments.py
158

Instead of --default-order, I'd prefer something that describes the order. I'm not sure of the best name, especially given that it's always possible we'll think of other tweaks to the default order in the future. Perhaps something like --auto-order or --smart-order to at least suggest it's something clever. What do you think?

llvm/utils/lit/tests/lit.cfg
65–67

I think it would be helpful to extend the above comments to explain why --lexical-order is needed.

llvm/utils/lit/tests/reorder.py
4

As far as I can tell, we now have no test that --default-order is actually the default. Sorry, I didn't think of that when I made that suggestion. If we ever broke that behavior (for example, someone could reorganize the arg specification in a subtly broken way), I wonder how long it would be before anyone realized that test suites started taking much longer than necessary.

What if we have something like a %{lit-no-order-opt} that doesn't specify any order options? reorder.py could then run lit twice, once to ensure --default-order is indeed the default, and once to ensure it overrides --lexical-order and --shuffle when specified afterward.

I guess that means --default-order isn't so useful in lit's own test suite. It's just another feature to test. However, as I mentioned in D107427, I'd like to use it in clang/test/utils too for testing test-suite-generation scripts.

jdenny added inline comments.Aug 9 2021, 10:14 AM
llvm/utils/lit/tests/reorder.py
4

I guess that means --default-order isn't so useful in lit's own test suite. It's just another feature to test. However, as I mentioned in D107427, I'd like to use it in clang/test/utils too for testing test-suite-generation scripts.

Well, that actually makes no sense. I want --lexical-order there not --default-order.

Hmm. I'm not opposed to keeping something like --default-order for completeness, but I'm now having trouble seeing a real use case. What do others think?

mgorny added inline comments.Aug 9 2021, 10:23 AM
llvm/utils/lit/tests/reorder.py
4

To be honest, I don't really see a use case for testing which order is the default. After all, setting the default order is a matter of changing the parameter to default=… A test for this sounds like asking people to change the default in two places.

jdenny added inline comments.Aug 10 2021, 8:07 AM
llvm/utils/lit/tests/reorder.py
4

I don't agree with the argument that a behavior whose current implementation looks simple doesn't need a regression test, and I've previously found that the apparent simplicity of an argparse config can be deceptive. If we're going to implement these options, then I think we should test them fully, especially that the intended default behavior is actually the default. If the default ever changes accidentally, it could quietly affect many people. Moreover, testing it is straight-forward: we just need another lit substitution to use in reorder.py.

After further thought, I do think having a different option for each possible order is likely to prove useful, so I vote we keep --default-order (or whatever we end up calling it).

yln added a comment.Aug 11 2021, 3:07 PM

Thanks for working on this! :)

llvm/utils/lit/lit/cl_arguments.py
14

While we are at it, can we also give DEFAULT a more descriptive name here?

Order predicate: (not t.previous_failure, -t.previous_elapsed, t.getFullName())

  1. Failures first
  2. Longest tests first
  3. Then by name

Since putting all of the above in an enum name is quite a mouthful, I am advocating for smart as suggested by Joel.

Together with my other comment (see below) the help text could then be:
--order=(smart|aaa|bbb|...) [default: smart]

This would align the names used for options and names in the code.

158

I would also prefer descriptive names instead of "default".

If we already put the effort i to improve ordering, how about: --order=(aaa|bbb|...) and make the old ones aliases for the new ones: --shuffle --> --order=random.

I think argparse spells out all possible options and even shows which one is the default when printing the help: --order=(smart|aaa|bbb|...) [default: smart]

So it shows the user the available options, which option is the default, and the explanation of these options is right next to it.
Having all options documented right next to each other also improves discoverability and makes contrasting the differences between options easier.

yln added inline comments.Aug 11 2021, 3:11 PM
llvm/utils/lit/lit/cl_arguments.py
158

If it wasn't clear, I am advocating for using parser.add_argument('--order', choices=['smart', aaa', 'bbb']) here.
https://docs.python.org/3/library/argparse.html#choices

mgorny marked 8 inline comments as done.Aug 12 2021, 1:12 AM
mgorny updated this revision to Diff 365930.Aug 12 2021, 1:15 AM
mgorny edited the summary of this revision. (Show Details)

Switched to choice-based --order= option. Renamed 'default' to 'smart'. Added a new lit subst not to include order args. Made %{lit} include -j1 as well. Added the comment.

I think this addresses all the comments.

I have also removed the enum since it seemed to make little sense now... but I have a new idea, so I might reintroduce it in a minute ;-).

mgorny updated this revision to Diff 365932.Aug 12 2021, 1:28 AM

Restore the TestOrder enum. It doesn't change much logically but gives some abstraction that'll hopefully make uncaught typos less likely in the future.

--order is nice.

Also, the docs should be updated: https://llvm.org/docs/CommandGuide/lit.html

This is still needed.

mgorny updated this revision to Diff 367411.Aug 19 2021, 12:36 AM

Updated docs.

jdenny accepted this revision.Aug 27 2021, 11:28 AM

LGTM. Thanks.

This revision is now accepted and ready to land.Aug 27 2021, 11:28 AM
This revision was automatically updated to reflect the committed changes.
MatzeB added a subscriber: MatzeB.Jun 28 2022, 3:45 PM

Hey, I don't know if this has anything to do with this change.

But I see the "llvm-clang-x86_64-sie-win" buildbot (https://lab.llvm.org/buildbot/#/builders/216) failing the reorder.py testcase randomly every 10-20 builds.

See https://lab.llvm.org/buildbot/#/builders/216 for the list of recent builds.

I didn't look deeply into the root cause here, but thought it would be good to ping here in case anyone has an idea to get the bot more stable.

Herald added a project: Restricted Project. · View Herald TranscriptJun 28 2022, 3:45 PM
dyung added a subscriber: dyung.Jun 29 2022, 5:10 PM

Hey, I don't know if this has anything to do with this change.

But I see the "llvm-clang-x86_64-sie-win" buildbot (https://lab.llvm.org/buildbot/#/builders/216) failing the reorder.py testcase randomly every 10-20 builds.

See https://lab.llvm.org/buildbot/#/builders/216 for the list of recent builds.

I didn't look deeply into the root cause here, but thought it would be good to ping here in case anyone has an idea to get the bot more stable.

I'm the owner of that buildbot, but unfortunately have not been able to get the failure to reproduce outside of the buildbot initiated build environment, so I haven't been able to really look into the issue. Any ideas how to possibly debug and/or fix the issue would be greatly appreciated as I've been a bit stuck on how to get it more stable!

ychen added a subscriber: ychen.Jun 29 2022, 5:16 PM

Hey, I don't know if this has anything to do with this change.

But I see the "llvm-clang-x86_64-sie-win" buildbot (https://lab.llvm.org/buildbot/#/builders/216) failing the reorder.py testcase randomly every 10-20 builds.

See https://lab.llvm.org/buildbot/#/builders/216 for the list of recent builds.

I didn't look deeply into the root cause here, but thought it would be good to ping here in case anyone has an idea to get the bot more stable.

I'm the owner of that buildbot, but unfortunately have not been able to get the failure to reproduce outside of the buildbot initiated build environment, so I haven't been able to really look into the issue. Any ideas how to possibly debug and/or fix the issue would be greatly appreciated as I've been a bit stuck on how to get it more stable!

It's kind of concerning that the test (reorder.py) writes to the source-controlled input file. I'm not sure if using a temporary directory to save intermediate files (like most other tests do) would fix the issue.
Alternatively, as the last resort, I'd suggest temporarily inserting some extra RUN lines in the test to help debug the random failure.