This is an archive of the discontinued LLVM Phabricator instance.

Documentation for test timeout
ClosedPublic

Authored by chaoren on Dec 15 2014, 4:18 PM.

Details

Summary

Moved the timeout documentation to the correct location, and added it to the help string. Also forgot to check for the no timeout case.

Should subsequent lines of the help string be indented?

Diff Detail

Event Timeline

chaoren updated this revision to Diff 17310.Dec 15 2014, 4:18 PM
chaoren retitled this revision from to Documentation for test timeout.
chaoren updated this object.
chaoren edited the test plan for this revision. (Show Details)
chaoren added reviewers: vharron, zturner.
chaoren added a subscriber: Unknown Object (MLST).
zturner edited edge metadata.Dec 15 2014, 4:26 PM

Looks ok to me, but in the (hopefully near) future I would like to see LLDB_TEST_TIMEOUT change from an environment variable to an argument to dotest.py. Correct use of environment variables requires having intimate knowledge of the code in order to use correctly, increasing the code complexity. Having the value be a command line option means it's documented clearly for anyone just by running dotest.py --help

Can dosep supply different arguments to different invocations of dotest?
Otherwise, every test must run with the same time limit under dosep if we
supply timeout as an argument.

Isn't that the case with environment variable as well? Otherwise you would
need some kind of per-test environment variable. Basically any environment
variable set before you run dotest or dosep is global state, and so is any
command line option you pass in, so they seem functionally equivalent to
me, just a matter of how they are specified

This is how I handle it:

+Override the time limit for individual tests with LLDB_[TEST NAME]_TIMEOUT.
+E.g., LLDB_TESTCONCURRENTEVENTS_TIMEOUT=2m

That could still be solved with command line argument. For example:

--default-timeout=5m --test-timeout={'TestConcurrentEvents': '2m'}

Where the second one is a dictionary that can be directly parsed by python.
This also has the advantage of being easy to set in CMake, so that ninja
check-lldb will use reasonable timeouts.

I have to wonder though, is the ability to override test timeout worth the
added complexity (regardless of whether it uses environment variable or
command line)? How frequently will this be used? If a test is special
enough to want to override its timeout, it seems like maybe that should be
a property of the test, and not the person running the test. Imagine the
setup required to do this. It really complicates things over the current
state of "just run dotest.py" to ave to set all these environment variables
before running the script.

So i vote for one global timeout, and if it has to be overridable then an
attribute on the test class/method like @maxTimeout('2m') or
@minTimeout('10m') which would combine with the default.

Thoughts?

emaste added a subscriber: emaste.Dec 18 2014, 5:51 AM

Can dosep supply different arguments to different invocations of dotest?
Otherwise, every test must run with the same time limit under dosep if we
supply timeout as an argument.

Although we could solve this with some of the suggestions above, I'm not sure it's that important as long as the global timeout is long enough to accommodate all of the tests and short enough that the wait is reasonable if there's a stuck test. 5 minutes seems like it would accomplish that.

My concern, and the reason I brought it up, is because it seems like
unnecessary complexity which only benefits a handful of people. I don't
like adding complexity that isn't useful for a wide audience.

If this is just for making your own local test suite terminate in a
reasonable amount of time, then you can do that with an external shell
script which you run when you notice it's hung. If the issue is not knowing
it's hung, then we should change the test runner to print its output as the
test suite runs instead of at the end (we should do this anyway actually
because that's very useful).

And in addition to both of those, we really should be either fixing the
tests or disabling them. Now that this timeout is in, i have a feeling that
fixing the hanging tests will be de-prioritized.

Anyway, this patch is going in (or has gone in) for now, but I really don't
want to see this timeout logic get any more complicated in its current form.

vharron edited edge metadata.Dec 18 2014, 8:34 AM

And in addition to both of those, we really should be either fixing the

tests or disabling them. Now that this timeout is in, i have a feeling that
fixing the hanging tests will be de-prioritized.

Currently there are tests that only hang on the debian buildbot. These
hangs obscure all other test failures. This patch was the only way to
figure out what those tests are. Unfortunately, they are not hanging
locally for us and are therefore not at the top of our priority list. We
could mark them as XFail, but we still need the timeout in in case this
happens again.

Remember, without the timeout feature, if a single test hangs, you need to
surgically delete the hung test or you lose all test results.

Just to be clear, im totally on board with the global timeout. It's the
per-test override I'm not crazy about, since it doesn't matter much for the
bot, and it's non-discoverable for the user.

This patch, in fact =)

test/dosep.py
8

You haven't said anything about setting env vars.

14

E.g. LLDB_TEST_TIMEOUT=0

140

Again, make sure you call out that these are environment variables.

chaoren updated this revision to Diff 17528.Dec 19 2014, 6:32 PM
chaoren edited edge metadata.

Actually mention that these are env vars as per Vince's suggestion.

vharron accepted this revision.Dec 29 2014, 2:34 PM
vharron edited edge metadata.
This revision is now accepted and ready to land.Dec 29 2014, 2:34 PM
chaoren closed this revision.Jan 7 2015, 2:14 PM