This is an archive of the discontinued LLVM Phabricator instance.

[lit] Try to remove the flakeyness of `shtest-timeout.py` and `googletest-timeout.py`.
ClosedPublic

Authored by delcypher on Oct 7 2020, 5:23 PM.

Details

Summary

The tests previously relied on the short.py and FirstTest.subTestA
script being executed on a machine within a short time window (1 or 2
seconds). While this "seems to work" it can fail on resource constrained
machines. We could bump the timeout a little bit (bumping it too
much would mean the test would take a long time to execute) but it wouldn't
really solve the problem of the test being prone to failures.

This patch tries to remove this flakeyness by separating testing into
two separate parts:

  1. Testing if a test can hit a timeout.
  2. Testing if a test can run to completion in the presence of a

timeout.

This way we can give (1.) a really short timeout (to make the test run
as fast as possible) and (2.) a really long timeout. This means for (2.)
we are no longer trying to rely on the "short" test executing within
some short time window. Instead the window is now 3600 seconds which
should be long enough even for a heavily resource constrained machine to
execute the "short" test.

Thanks to Julian Lettner for suggesting this approach. This superseeds
my original approach in https://reviews.llvm.org/D88807.

Diff Detail

Event Timeline

delcypher created this revision.Oct 7 2020, 5:23 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 7 2020, 5:23 PM
delcypher requested review of this revision.Oct 7 2020, 5:23 PM
yln accepted this revision.Oct 7 2020, 5:52 PM

LGTM with nits

llvm/utils/lit/tests/googletest-timeout.py
18

Possible cleanup: %t.cfgset.err seems to be unused

43

Other than checking the debug output on stderr ("Forcing timeout to be 2 seconds") this test actually does not have a reliable way of *really* checking that the timeout was 2 (and not 1). (without making it flaky as in our original problem)

So we could let this test pass immediately (make it fast!) by setting a large timeout override, checking stderr for that value ("Forcing timeout to be 3600 seconds") and use QuickSubTest to pass the test immediately.

llvm/utils/lit/tests/shtest-timeout.py
74

prescence -> presence
"Test tests" is difficult to read

This revision is now accepted and ready to land.Oct 7 2020, 5:52 PM
yln added inline comments.Oct 7 2020, 5:57 PM
llvm/utils/lit/tests/shtest-timeout.py
63–68

Same suggestion as above: only check timeout is really the debug message on stderr.

But that doesn't require the test to fail. We could immediately pass it instead of hanging for 2 seconds.

delcypher updated this revision to Diff 296848.Oct 7 2020, 6:42 PM

Address first round of feedback.

delcypher updated this revision to Diff 296849.Oct 7 2020, 6:44 PM
delcypher marked 3 inline comments as done.

Fix typo.

delcypher marked an inline comment as done.Oct 7 2020, 6:44 PM
delcypher added inline comments.
llvm/utils/lit/tests/googletest-timeout.py
43

Very true I've updated the tests to take this approach.

yln accepted this revision.Oct 7 2020, 6:50 PM
This revision was landed with ongoing or failed builds.Oct 8 2020, 10:46 AM
This revision was automatically updated to reflect the committed changes.