This is an archive of the discontinued LLVM Phabricator instance.

[lit] Implement support of per test timeout in lit.
ClosedPublic

Authored by delcypher on Nov 16 2015, 7:53 AM.

Details

Summary

[lit] Implement support of per test timeout in lit.

This should work with ShTest (executed externally or internally) and GTest
test formats.

To set the timeout a new option `--timeout=` has
been added which specifies the maximum run time of an individual test
in seconds. By default this 0 which causes no timeout to be enforced.

The timeout can also be set from a lit configuration file by modifying
the `lit_config.maxIndividualTestTime` property.

To implement a timeout we now require the psutil Python module if a
timeout is requested. This dependency is confined to the newly added
`lit.util.killProcessAndChildren()`. A note has been added into the
TODO document describing how we can remove the dependency on the
`pustil` module in the future. It would be nice to remove this
immediately but that is a lot more work and Daniel Dunbar believes it is
better that we get a working implementation first and then improve it.

To avoid breaking the existing behaviour the psutil module will not be
imported if no timeout is requested.

The included testcases are derived from test cases provided by
Jonathan Roelofs which were in an previous attempt to add a per test
timeout to lit (http://reviews.llvm.org/D6584). Thanks Jonathan!

Diff Detail

Event Timeline

delcypher updated this revision to Diff 40289.Nov 16 2015, 7:53 AM
delcypher retitled this revision from to [lit] Implement support of per test timeout in lit..
delcypher updated this object.
delcypher added a subscriber: llvm-commits.
jroelofs added inline comments.Nov 16 2015, 8:14 AM
utils/lit/lit/main.py
208

Please name the flag "--timeout" so it's easier to type out on the llvm-lit command line. It's probably a good idea to keep the descriptive variable name though.

utils/lit/tests/Inputs/per_test_timeout/slow.py
8 ↗(On Diff #40289)

The infinite loop testcase was actually useful to guarantee that the timeout detection & handling code isn't broken. With this testcase, it could be broken and we wouldn't notice (apart from the tests running a little slower).

delcypher added inline comments.Nov 16 2015, 8:44 AM
utils/lit/tests/Inputs/per_test_timeout/slow.py
8 ↗(On Diff #40289)

If the `slow.py program doesn't cause a timeout then several tests will fail. If you replace time.sleep(6) with time.sleep(0)` then run lit on the test cases you will see failures.

python2 ~/dev/llvm-upstream/src/utils/lit/lit.py -v --debug  tests/Inputs/per_test_timeout/
********************
Testing Time: 0.41s
********************
Failing Tests (3):
    per_test_timeout :: timeout_external.txt
    per_test_timeout :: timeout_internal.txt
    per_test_timeout :: timeout_internal_set_in_config.txt

  Expected Passes    : 2
  Unexpected Failures: 3

What you had before (an infinite loop and an `XFAIL line) was not helpful because running the above command would **always** hang. One work around if you really want a test case with an infinite loop is to do some trickery with lit.cfg so that when we invoke it from the command line slow.py` doesn't get detected but when we run lit inside lit it is detected.

jroelofs added inline comments.Nov 16 2015, 8:51 AM
utils/lit/tests/Inputs/per_test_timeout/slow.py
8 ↗(On Diff #40289)

I don't understand why "trickery" is needed in order to make the infinite loop testcase work. The test worked in the diff I uploaded to your llvm-dev thread about this...

delcypher updated this revision to Diff 40300.Nov 16 2015, 9:18 AM
  • Rename `--max-individual-test-time to --timeout` as requested by Jonathan Roelofs
  • Add infinite loop test case as requested by Jonathan Roelofs
delcypher marked an inline comment as done.Nov 16 2015, 9:26 AM
delcypher added inline comments.
utils/lit/tests/Inputs/per_test_timeout/slow.py
8 ↗(On Diff #40300)

If I did lit.py tests/Inputs/per_test_timeout/ lit would try to run the .py scripts (in addition to the .txt) which would hang.

I've changed lit.cfg just to run the .txt scripts. I was mistaken about needing lit.cfg trickery here.
I thought that would also prevent the .py scripts from being in the .txt test cases that invoke lit (which is why I mentioned the lit.cfg trickery) but if you manually specify the test case names it doesn't seem to matter that .py isn't in config.suffixes`.

The latest patch I just uploaded should address this.

jroelofs added inline comments.Nov 16 2015, 10:04 AM
utils/lit/tests/Inputs/per_test_timeout/lit.cfg
26 ↗(On Diff #40300)

Please revert this part of the change.

utils/lit/tests/Inputs/per_test_timeout/slow.py
8 ↗(On Diff #40300)

Ohh, I see what you were running into now. I don't think we have to worry about running lit on that particular test directory without the requisite --timeout flag.

I've changed lit.cfg just to run the .txt scripts.

Running lit.py tests/Inputs/per_test_timeout/ should hang, but running lit.py tests/Inputs/per_test_timeout/ --timeout=1 should not.

delcypher added inline comments.Nov 16 2015, 10:32 AM
utils/lit/tests/Inputs/per_test_timeout/slow.py
8 ↗(On Diff #40300)

If this is the case why did the patch you sent to post with the subject [lit] RFC: Per test timeout have timeout.py invoke lit inside lit with the --timeout flag? If you expect the caller to use --timeout= then having an additional test case that invokes lit on the other test cases with that flag seems redundant.

In my current implementation the "lit invoking lit" (timeout_*.txt tests) tests are not redundant because they invoke lit in three different ways

  • Use internal shell, set timeout on command line
  • Use external shell, set timeout on command line
  • Use internal shell, set timeout in configuration file (i.e. don't use command line flag)

I can revert the small change I made to lit.py but this begs the question

  • Who is running these tests?
  • How does the person running these tests know how to run them correctly and interpret their results?
jroelofs added inline comments.Nov 16 2015, 11:47 AM
utils/lit/tests/Inputs/per_test_timeout/slow.py
8 ↗(On Diff #40300)

If this is the case why did the patch you sent to post with the subject [lit] RFC: Per test timeout have timeout.py invoke lit inside lit with the --timeout flag? If you expect the caller to use --timeout= then having an additional test case that invokes lit on the other test cases with that flag seems redundant.

Because that's how you're supposed to test lit features? Each of the python files directly in path/to/llvm/utils/lit/tests is itself a testcase for lit (some (all?) of which have their own testsuite in the Inputs folder), and that's where I expect the caller to know to have the --timeout= arg. Let me know if I can clarify this any better... it's a little hard to explain.

In my current implementation the "lit invoking lit" (timeout_*.txt tests) tests are not redundant because they invoke lit in three different ways

Use internal shell, set timeout on command line
Use external shell, set timeout on command line
Use internal shell, set timeout in configuration file (i.e. don't use command line flag)

Each of those three ways of invoking lit to test this feature should be its own file in path/to/llvm/utils/lit/tests... i.e. move them up one directory.

Who is running these tests?

People who make changes to lit should be running them. It'd be ideal if they got run as part of llvm's make check, but that isn't the case currently. I've been meaning to put in a patch for this, so thanks for reminding me.

How does the person running these tests know how to run them correctly and interpret their results?

The correct way to run the lit testsuite for lit, is to run lit.py path/to/llvm/utils/lit/tests. The results are inretpreted by the top-level testsuite's tests, and if lit is not broken, they should all pass.

delcypher added inline comments.Nov 16 2015, 12:02 PM
utils/lit/tests/Inputs/per_test_timeout/slow.py
8 ↗(On Diff #40300)

Thanks for clarifying. I had completely misunderstood how the tests were meant to be run (and consequently how they were meant to be written).

I'll try to rewrite the tests.

jroelofs added inline comments.Nov 16 2015, 12:20 PM
utils/lit/tests/Inputs/per_test_timeout/slow.py
8 ↗(On Diff #40300)

Ok, cool... glad we're on the same page now.

One thing I just thought of: you'll need to set up the top-level tests such that they detect whether import psutil fails, so they can decide whether to test --timeout= at all. I'm not sure what the best way to do that would be, but here's an idea to start from: set an 'available feature' based on the presence of psutil, and then add REQUIRES: python-psutil in them.

delcypher updated this revision to Diff 40345.Nov 16 2015, 2:39 PM
  • Rewrite tests. Tests now include testing of a per test timeout for GTest tests.
delcypher marked an inline comment as done.Nov 16 2015, 2:41 PM
delcypher added inline comments.
utils/lit/tests/Inputs/per_test_timeout/slow.py
8 ↗(On Diff #40300)

Please see the latest revision (Diff 3). I've rewritten the tests and added some new ones too.

jroelofs added inline comments.Nov 16 2015, 2:57 PM
utils/lit/lit/main.py
478

s/timeouts/Timeouts/

utils/lit/tests/Inputs/googletest-timeout/DummySubDir/OneTest
32

copy-pasta? s/subtest B/subtest C/

utils/lit/tests/Inputs/per_test_timeout/slow.py
8 ↗(On Diff #40300)

Cool, thanks!

utils/lit/tests/lit.cfg
53

I'd make a note that --timeout functionality will be disabled too.

utils/lit/tests/shtest-timeout.py
13

I think you only need one set of CHECK lines after the three sets of RUN lines. Since the only part that's different is the ERR output, only those checks need to be duplicated/specialized.

delcypher marked an inline comment as done.Nov 16 2015, 3:31 PM
delcypher added inline comments.
utils/lit/lit/main.py
478

Will fix.

utils/lit/tests/Inputs/googletest-timeout/DummySubDir/OneTest
32

Oops yes that's a mistake.

utils/lit/tests/lit.cfg
53

I'll make a note of it.

utils/lit/tests/shtest-timeout.py
13

That's a good idea. I'll do that.

delcypher updated this revision to Diff 40355.Nov 16 2015, 3:33 PM
  • Fix some typos
  • Note that --timeout will be disabled in the warning emitted during testing if psutil can't be found
  • Add TIMEOUT tests to the list of failed tests that are listed in main.py
delcypher marked 8 inline comments as done.Nov 16 2015, 3:35 PM

Hopefully that's most of the testing issues fixed.

jroelofs accepted this revision.Nov 16 2015, 3:36 PM
jroelofs edited edge metadata.

LGTM with one more fix:

utils/lit/tests/shtest-timeout.py
42

It'd be better to just put them here, and delete this comment (as well as the other one).

This revision is now accepted and ready to land.Nov 16 2015, 3:36 PM
delcypher updated this revision to Diff 40356.Nov 16 2015, 3:41 PM
delcypher edited edge metadata.
  • Minor re-order of FileCheck CHECK directives
delcypher marked an inline comment as done.Nov 16 2015, 3:41 PM
delcypher added inline comments.
utils/lit/tests/shtest-timeout.py
41

Done

delcypher updated this object.Nov 16 2015, 3:43 PM
delcypher updated this revision to Diff 40831.Nov 20 2015, 2:32 PM
delcypher updated this object.
delcypher marked an inline comment as done.
  • Made lit output more useful (when -v is passed) when a timeout occurs, the exit code, stdout and stderr are now shown. If using the internal shell it also indicated which command(s) were killed by the timeout.
  • Improve tests to check for the improved output from lit when a timeout occurs.

Looks good.

utils/lit/lit/TestRunner.py
582

Oh cool, I've never noticed Phabricator highlighting rebase changes before.

delcypher added inline comments.Nov 21 2015, 2:25 AM
utils/lit/lit/TestRunner.py
582

I don't see anything different. Maybe that's because you added a comment?

That change (r253556) was a bug I found in lit while working on this patch. I committed that fix directly to trunk
because I thought it was trivial enough to not need reviewing.

MatzeB edited edge metadata.Dec 4 2015, 8:55 PM

This is a good feature to have.

I think this implementation will not apply a timeout to the TestRunner.py:executeShCmd codepath (which seems to be a /bin/sh emulation for windows users).
The requirement of an external python package is unfortunate, what about things possible with built-in python libraries such as "os.kill(pid, signal.SIGKILL)" or "resource.setrlimit(resource.RLIMIT_CPU ...)"?

This is a good feature to have.

Thanks for taking a look

I think this implementation will not apply a timeout to the TestRunner.py:executeShCmd codepath (which seems to be a /bin/sh emulation for windows users).

That is incorrect. The timeout is applied on all code paths when the timeout is
enabled. For the executeShCmd() (internal shell) case the timeout is
implemented in a different way to lit.util.executeCommand() (external shell)
but this definitely works (at least on Linux). I added tests cases for the three
cases: External Shell, Internal Shell and GTest.

The requirement of an external python package is unfortunate, what about things possible with built-in python libraries such as "os.kill(pid, signal.SIGKILL)" or "resource.setrlimit(resource.RLIMIT_CPU ...)"?

While it is not impossible to remove the dependency on the pustil module,
doing so would be difficult due to platform differences. The issue here is
that when killing a process, we must kill the process and its children (and
their children and so on...). If we don't do this we will have processes left
behind that can continue executing past the timeout.

This is even an issue in the most basic case where we use the external shell to
execute a single command because the parent process is the shell and the child
is the actual command we wanted to run.

There are ways to achieve killing of a group of processes on Linux (not sure
about OSX) by having the parent be in its own session (via
preexe_fn=cos.setsid()) and then later killing using os.killpg() but
that isn't available on Windows and so isn't portable. The aproach I use is
because pustil provides a way to easily iterate over a process's children
and kill them. I don't know of any cross platform way of doing this using just
Python's standard library.

I have kept the dependency on pustil confined to the
lit.util.killProcessAndChildren() function so in the future we could
potentially remove the dependency. It's not something I can do right now
though as I don't have access to a Windows or OSX machine.

cmatthews edited edge metadata.Dec 5 2015, 4:38 PM
cmatthews added a subscriber: cmatthews.

I think needing a package that is not included in the Python standard release this is a serious issue. lit has no dependencies. That makes it easy to build and test anywhere.

I think needing a package that is not included in the Python standard release this is a serious issue. lit has no dependencies. That makes it easy to build and test anywhere.

It is not a hard dependency. It is only required if you want use the per test timeout. If you don't use the timeout lit will work fine without pustil. Although it would be nice to not have psutil as a dependency I'd like to point out that:

  • Installing pustil is ridiculously easy. If you are root its just a single command`pip install psutil`, if you're not root you can just use virtualenv and then install it there.
  • Not using pustil will likely lead to less portability (at least initially). The pustil module is awesome and a lot of hard work has gone into making it work on all platforms. It's likely that our attempts to emulate this work inside lit would not be as portable.
  • lit is used by llvm developers or it is obtained via the package on PyPi. In the PyPi case pustil can be made a dependency so users will automatically get pustil. In the case for LLVM developers I believe that they are competent enough to know how to install pustil if they need it.

When I tried to implement this a year ago, it turned out not to be possible on python 2.7 (without the external library). The features needed to make it work weren't added until 3.2, I believe.

I think having psutil as a soft dependency is fine.

utils/lit/lit/TestRunner.py
582

I was looking at the diff of Diff 5 vs Diff 6. And yeah, I'd consider that trivial too.

delcypher updated this revision to Diff 43369.Dec 21 2015, 6:45 AM
delcypher updated this object.
delcypher edited edge metadata.
  • Downgrade warning about `--timeout=` command line option overriding the LitConfig setting to a note
  • To not print the per test timeout when running
  • Improve testing of `--timeout=` overriding the LitConfig setting
  • Remove implementing timeout from TODO
  • Add notes on improving per test timeout implementation in TODO
This revision was automatically updated to reflect the committed changes.