This is an archive of the discontinued LLVM Phabricator instance.


Differential D88807

[lit] Try to remove the flakeyness of `shtest-timeout.py` and `googletest-timeout.py`.
Abandoned, Public

Authored by delcypher on Oct 4 2020, 2:42 PM.

Details

Reviewers

dblaikie
yln
jroelofs
cmatthews
MatzeB
mehdi_amini

Summary

The tests previously relied on the short.py script and the FirstTest.subTestA
subtest completing within a short time window (1 or 2 seconds). While this
"seems to work", it can fail on resource-constrained machines. We could bump
the timeout a little (bumping it too much would make the test take a long
time to execute), but that wouldn't really solve the problem of the test
being prone to failure.

Instead, this patch tries to make the tests less flakey by no longer
running short.py and FirstTest.subTestA. The trade-off is that we no longer
test whether a test can complete execution when a timeout is set.

This seems like the right trade-off right now because debugging this
flakey test is not a good use of engineering time.

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	1,130 ms	linux > libarcher.parallel::parallel-nosuppression.c

Event Timeline

delcypher created this revision. Oct 4 2020, 2:42 PM

Herald added a project: Restricted Project. Oct 4 2020, 2:42 PM

delcypher requested review of this revision. Oct 4 2020, 2:42 PM

Harbormaster completed remote builds in B73935: Diff 296080. Oct 4 2020, 2:54 PM

dblaikie added inline comments. Oct 4 2020, 6:38 PM

llvm/utils/lit/tests/Inputs/googletest-timeout/DummySubDir/OneTest.py
Lines 20–27: If these (subTestB and subTestC) do the same thing, should one of them be removed to avoid redundancy? (And/or was the intent to keep the not-slow test, A, plus only one of B or C, if it is sufficiently less flakey?)

yln added a comment.

Flaky tests are the worst! Thanks for working on this :)

I think the principled way here would be to have two separate lit invocations for timeout fail/pass:

not lit --timeout=1 infinite_hang.py
# Will hang for 1 second, then aborted by timeout.

lit --timeout=999 short.py
# Will take however long it takes to do `short.py`, then pass. 999 just means "very long timeout"
# This will be immediate on normal hosts, and whatever time it takes on resource constrained hosts.

Or am I missing something?

delcypher added a comment.

In D88807#2312194, @yln wrote:
Flaky tests are the worst! Thanks for working on this :)

I think the principled way here would be to have two separate lit invocations for timeout fail/pass:
not lit --timeout=1 infinite_hang.py
# Will hang for 1 second, then aborted by timeout.

lit --timeout=999 short.py
# Will take however long it takes to do `short.py`, then pass. 999 just means "very long timeout"
# This will be immediate on normal hosts, and whatever time it takes on resource constrained hosts.
Or am I missing something?

You're not missing anything. This is a very good idea and I wish I had thought of it. I'll update the patch to take the approach you've outlined.
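
A minimal sketch of why the split suggested above removes the timing dependence, written against plain `subprocess` rather than lit itself. lit's real timeout handling is not shown here; `run_with_timeout` and the two inline child programs are hypothetical stand-ins for `infinite_hang.py` and `short.py`. The hanging program is paired with a short timeout and the short program with a very long one, so neither check depends on how fast the host is.

```python
import subprocess
import sys

# Hypothetical stand-ins for infinite_hang.py and short.py.
HANG = "while True:\n    pass\n"
SHORT = "print('short program')\n"


def run_with_timeout(source, timeout):
    """Run `source` in a child Python process.

    Returns "TIMEOUT" if the child is still running after `timeout`
    seconds, "PASS" if it exits with status 0, and "FAIL" otherwise.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", source],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising.
        return "TIMEOUT"
    return "PASS" if proc.returncode == 0 else "FAIL"


# Check 1: a program that never finishes must hit a short timeout.
hang_result = run_with_timeout(HANG, timeout=1)

# Check 2: a program that finishes must pass under a generous timeout.
# The 999 only bounds the worst case; a slow host just takes longer.
short_result = run_with_timeout(SHORT, timeout=999)

print(hang_result, short_result)
```

The first check costs about one second regardless of the machine, and the second costs only as long as the short program actually takes, which is the property the two separate lit invocations are meant to provide.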

llvm/utils/lit/tests/Inputs/googletest-timeout/DummySubDir/OneTest.py
Lines 20–27: Probably, but @yln's comments mean that I might do things very differently.

Joe_Nash added a subscriber: Joe_Nash. Oct 7 2020, 1:30 PM

delcypher mentioned this in D89020: [lit] Try to remove the flakeyness of `shtest-timeout.py` and `googletest-timeout.py`. Oct 7 2020, 5:23 PM

This patch is superseded by https://reviews.llvm.org/D89020.

delcypher mentioned this in rG295d4e420fd0: [lit] Try to remove the flakeyness of `shtest-timeout.py` and `googletest…. Oct 8 2020, 10:46 AM

Revision Contents

Path	Size
llvm/utils/lit/tests/Inputs/googletest-timeout/DummySubDir/OneTest.py	14 lines
llvm/utils/lit/tests/Inputs/shtest-timeout/short.py
llvm/utils/lit/tests/googletest-timeout.py	2 lines
llvm/utils/lit/tests/shtest-timeout.py	29 lines

Diff 296080

llvm/utils/lit/tests/Inputs/googletest-timeout/DummySubDir/OneTest.py

 #!/usr/bin/env python

 import sys
 import time

 if len(sys.argv) != 2:
     raise ValueError("unexpected number of args")

 if sys.argv[1] == "--gtest_list_tests":
     print("""\
 FirstTest.
-  subTestA
   subTestB
   subTestC
 """)
     sys.exit(0)
 elif not sys.argv[1].startswith("--gtest_filter="):
     raise ValueError("unexpected argument: %r" % (sys.argv[1]))

 test_name = sys.argv[1].split('=',1)[1]
-if test_name == 'FirstTest.subTestA':
-    print('I am subTest A, I PASS')
-    print('[ PASSED ] 1 test.')
-    sys.exit(0)
-elif test_name == 'FirstTest.subTestB':
-    print('I am subTest B, I am slow')
-    time.sleep(6)
-    print('[ PASSED ] 1 test.')
-    sys.exit(0)
+if test_name == 'FirstTest.subTestB':
+    print('I am subTest B, I will hang')
+    while True:
+        pass
 elif test_name == 'FirstTest.subTestC':
     print('I am subTest C, I will hang')
     while True:
         pass
 else:
     raise SystemExit("error: invalid test name: %r" % (test_name,))
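
For context, the fake executable above mimics the two GoogleTest command-line modes a driver relies on: `--gtest_list_tests` to enumerate tests and `--gtest_filter=` to run a single one. The sketch below shows one way a driver could turn the listing into per-test invocations. `parse_gtest_list` is a hypothetical helper, not lit's actual parser, and it assumes the usual GoogleTest layout in which test names are indented beneath their suite line.

```python
def parse_gtest_list(output):
    """Turn `--gtest_list_tests`-style output into full test names.

    Assumes suite lines start in column 0 and end with ".", while
    test lines are indented beneath their suite.
    """
    tests = []
    suite = None
    for line in output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if not line[0].isspace():
            suite = line.strip()  # e.g. "FirstTest."
        elif suite is not None:
            tests.append(suite + line.strip())
    return tests


# Listing as the patched OneTest.py would print it (assumed indentation).
listing = """\
FirstTest.
  subTestB
  subTestC
"""

names = parse_gtest_list(listing)
# One argument per test, matching what OneTest.py expects in sys.argv[1].
filters = ["--gtest_filter=" + name for name in names]
print(filters)
```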

llvm/utils/lit/tests/Inputs/shtest-timeout/short.py

This file was deleted.

-# RUN: %{python} %s
-from __future__ import print_function
-
-print("short program")

llvm/utils/lit/tests/googletest-timeout.py

 # REQUIRES: lit-max-individual-test-time

 # Check that the per test timeout is enforced when running GTest tests.
 #
 # RUN: not %{lit} -j 1 -v %{inputs}/googletest-timeout --timeout=1 > %t.cmd.out
 # RUN: FileCheck < %t.cmd.out %s

 # Check that the per test timeout is enforced when running GTest tests via
 # the configuration file
 #
 # RUN: not %{lit} -j 1 -v %{inputs}/googletest-timeout \
 # RUN: --param set_timeout=1 > %t.cfgset.out 2> %t.cfgset.err
 # RUN: FileCheck < %t.cfgset.out %s

 # CHECK: -- Testing:
-# CHECK: PASS: googletest-timeout :: {{[Dd]ummy[Ss]ub[Dd]ir}}/OneTest.py/FirstTest.subTestA
 # CHECK: TIMEOUT: googletest-timeout :: {{[Dd]ummy[Ss]ub[Dd]ir}}/OneTest.py/FirstTest.subTestB
 # CHECK: TIMEOUT: googletest-timeout :: {{[Dd]ummy[Ss]ub[Dd]ir}}/OneTest.py/FirstTest.subTestC
-# CHECK: Passed : 1
 # CHECK: Timed Out: 2

 # Test per test timeout via a config file and on the command line.
 # The value set on the command line should override the config file.
 # RUN: not %{lit} -j 1 -v %{inputs}/googletest-timeout \
 # RUN: --param set_timeout=1 --timeout=2 > %t.cmdover.out 2> %t.cmdover.err
 # RUN: FileCheck < %t.cmdover.out %s
 # RUN: FileCheck --check-prefix=CHECK-CMDLINE-OVERRIDE-ERR < %t.cmdover.err %s

 # CHECK-CMDLINE-OVERRIDE-ERR: Forcing timeout to be 2 seconds

llvm/utils/lit/tests/shtest-timeout.py

 # REQUIRES: lit-max-individual-test-time

 # llvm.org/PR33944
 # UNSUPPORTED: system-windows

-# FIXME: This test is fragile because it relies on time which can
-# be affected by system performance. In particular we are currently
-# assuming that `short.py` can be successfully executed within 2
-# seconds of wallclock time.
-
 # Test per test timeout using external shell
 # RUN: not %{lit} \
 # RUN: %{inputs}/shtest-timeout/infinite_loop.py \
-# RUN: %{inputs}/shtest-timeout/short.py \
-# RUN: -j 1 -v --debug --timeout 2 --param external=1 > %t.extsh.out 2> %t.extsh.err
+# RUN: -j 1 -v --debug --timeout 1 --param external=1 > %t.extsh.out 2> %t.extsh.err
 # RUN: FileCheck --check-prefix=CHECK-OUT-COMMON < %t.extsh.out %s
 # RUN: FileCheck --check-prefix=CHECK-EXTSH-ERR < %t.extsh.err %s
 #
 # CHECK-EXTSH-ERR: Using external shell

 # Test per test timeout using internal shell
 # RUN: not %{lit} \
 # RUN: %{inputs}/shtest-timeout/infinite_loop.py \
-# RUN: %{inputs}/shtest-timeout/short.py \
-# RUN: -j 1 -v --debug --timeout 2 --param external=0 > %t.intsh.out 2> %t.intsh.err
+# RUN: -j 1 -v --debug --timeout 1 --param external=0 > %t.intsh.out 2> %t.intsh.err
 # RUN: FileCheck --check-prefix=CHECK-OUT-COMMON < %t.intsh.out %s
 # RUN: FileCheck --check-prefix=CHECK-INTSH-OUT < %t.intsh.out %s
 # RUN: FileCheck --check-prefix=CHECK-INTSH-ERR < %t.intsh.err %s

 # CHECK-INTSH-OUT: TIMEOUT: per_test_timeout :: infinite_loop.py
 # CHECK-INTSH-OUT: command output:
 # CHECK-INTSH-OUT: command reached timeout: True

 # CHECK-INTSH-ERR: Using internal shell

 # Test per test timeout set via a config file rather than on the command line
 # RUN: not %{lit} \
 # RUN: %{inputs}/shtest-timeout/infinite_loop.py \
-# RUN: %{inputs}/shtest-timeout/short.py \
 # RUN: -j 1 -v --debug --param external=0 \
-# RUN: --param set_timeout=2 > %t.cfgset.out 2> %t.cfgset.err
+# RUN: --param set_timeout=1 > %t.cfgset.out 2> %t.cfgset.err
 # RUN: FileCheck --check-prefix=CHECK-OUT-COMMON < %t.cfgset.out %s
 # RUN: FileCheck --check-prefix=CHECK-CFGSET-ERR < %t.cfgset.err %s
 #
 # CHECK-CFGSET-ERR: Using internal shell

 # CHECK-OUT-COMMON: TIMEOUT: per_test_timeout :: infinite_loop.py
-# CHECK-OUT-COMMON: Timeout: Reached timeout of 2 seconds
+# CHECK-OUT-COMMON: Timeout: Reached timeout of 1 seconds
 # CHECK-OUT-COMMON: Command {{([0-9]+ )?}}Output

-# CHECK-OUT-COMMON: PASS: per_test_timeout :: short.py
-
-# CHECK-OUT-COMMON: Passed : 1
 # CHECK-OUT-COMMON: Timed Out: 1

 # Test per test timeout via a config file and on the command line.
 # The value set on the command line should override the config file.
 # RUN: not %{lit} \
 # RUN: %{inputs}/shtest-timeout/infinite_loop.py \
-# RUN: %{inputs}/shtest-timeout/short.py \
 # RUN: -j 1 -v --debug --param external=0 \
-# RUN: --param set_timeout=1 --timeout=2 > %t.cmdover.out 2> %t.cmdover.err
+# RUN: --param set_timeout=3 --timeout=1 > %t.cmdover.out 2> %t.cmdover.err
 # RUN: FileCheck --check-prefix=CHECK-CMDLINE-OVERRIDE-OUT < %t.cmdover.out %s
 # RUN: FileCheck --check-prefix=CHECK-CMDLINE-OVERRIDE-ERR < %t.cmdover.err %s

-# CHECK-CMDLINE-OVERRIDE-ERR: Forcing timeout to be 2 seconds
+# CHECK-CMDLINE-OVERRIDE-ERR: Forcing timeout to be 1 seconds

 # CHECK-CMDLINE-OVERRIDE-OUT: TIMEOUT: per_test_timeout :: infinite_loop.py
-# CHECK-CMDLINE-OVERRIDE-OUT: Timeout: Reached timeout of 2 seconds
+# CHECK-CMDLINE-OVERRIDE-OUT: Timeout: Reached timeout of 1 seconds
 # CHECK-CMDLINE-OVERRIDE-OUT: Command {{([0-9]+ )?}}Output

-# CHECK-CMDLINE-OVERRIDE-OUT: PASS: per_test_timeout :: short.py
-
-# CHECK-CMDLINE-OVERRIDE-OUT: Passed : 1
 # CHECK-CMDLINE-OVERRIDE-OUT: Timed Out: 1