This is an archive of the discontinued LLVM Phabricator instance.

Differential D77819

[lit] Add SKIPPED test result category
Closed · Public

Authored by yln on Apr 9 2020, 11:44 AM.

Details

Reviewers

rnk
jdenny
serge-sans-paille
probinson

Commits

rGcbe42a9d5fa7: [lit] Add SKIPPED test result category

Summary

Track and print the number of skipped tests. Skipped tests are tests
that should have been executed but weren't due to:

* user interrupt [Ctrl+C]
* --max-time (overall lit timeout)
* --max-failures

This is part of a larger effort to ensure that all discovered tests are
properly accounted for.

Add test for overall lit timeout feature (--max-time option) to
observe skipped tests. Extend test for --max-failures option.
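
The three stop conditions listed in the summary can be illustrated with a small stand-alone model. This is only a sketch, not lit's implementation: `FakeTest`, `run_all`, `interrupt`, `make_suite`, and the string result codes are hypothetical names introduced here, and tests run sequentially rather than on a worker pool.

```python
import time

# Hypothetical stand-ins for lit's result codes (plain strings for brevity).
PASS, FAIL, SKIPPED = "PASS", "FAIL", "SKIPPED"


class FakeTest:
    """Hypothetical test object: a name, a body to call, and a result slot."""

    def __init__(self, name, body):
        self.name = name
        self.body = body      # callable returning True on success
        self.result = None    # stays None until the test has actually run


def run_all(tests, max_time=None, max_failures=None):
    """Run tests in order; whatever never ran is marked SKIPPED on the way out."""
    deadline = None if max_time is None else time.monotonic() + max_time
    failures = 0
    try:
        for test in tests:
            if deadline is not None and time.monotonic() >= deadline:
                break  # overall timeout reached
            test.result = PASS if test.body() else FAIL
            if test.result == FAIL:
                failures += 1
                if max_failures is not None and failures >= max_failures:
                    break  # failure budget exhausted
    except KeyboardInterrupt:
        pass  # user interrupt: fall through to the accounting below
    finally:
        # Every test without a result is accounted for as SKIPPED.
        for test in tests:
            if test.result is None:
                test.result = SKIPPED
    return [t.result for t in tests]


def interrupt():
    raise KeyboardInterrupt  # simulates Ctrl+C arriving while a test runs


def make_suite():
    return [FakeTest("a", lambda: True),
            FakeTest("b", lambda: False),
            FakeTest("c", lambda: True)]
```

In this model, `run_all(make_suite(), max_failures=1)` stops after the first failure and reports the remaining test as SKIPPED instead of leaving it without a result.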

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

yln created this revision. Apr 9 2020, 11:44 AM

Herald added a project: Restricted Project. Apr 9 2020, 11:44 AM

Herald added subscribers: llvm-commits, delcypher.

Harbormaster completed remote builds in B52541: Diff 256347. Apr 9 2020, 11:56 AM

Thanks for doing this. I like the idea of the change.

Can skipped tests be covered in lit's test suite? I'm not sure how easy it is to do it in a portable manner, but platform-specific coverage is better than nothing.

llvm/utils/lit/lit/main.py
283–284	Does this todo mean you don't like the name SKIPPED?
284	There's a change in the effect of `opts.max_failures` on both skipped tests and the remaining unresolved tests. It seems like that should be mentioned in the patch summary. Can at least the latter change be covered in lit's test suite?

Improved commit message.

Add test for overall lit timeout feature (--max-time option) to
observe skipped tests.

Can skipped tests be covered in lit's test suite? I'm not sure how easy it is to do it in a portable manner, but platform-specific coverage is better than nothing.

Added a test for the overall lit timeout (--max-time), which previously didn't have one and which lets us observe the "skipped test" logic. I am not sure how to add one for the user interrupt [Ctrl+C].

llvm/utils/lit/lit/main.py
283–284	No, I just noticed that handling of FLAKYPASS is missing here and I want to address it as a separate change.

yln edited the summary of this revision. Apr 10 2020, 11:34 AM

Harbormaster completed remote builds in B52705: Diff 256623. Apr 10 2020, 11:38 AM

yln marked 3 inline comments as done. Apr 10 2020, 12:49 PM

yln added inline comments.

llvm/utils/lit/lit/main.py
284	UNRESOLVED tests are failures where the test failed due to an "infrastructure" issue, e.g., failure to parse a `REQUIRES:` line. Previously, all unexecuted tests (unexecuted for whatever reason: infrastructure failures, overall lit timeout, user interrupt, --max-tests) were marked UNRESOLVED. So both logical groups of unexecuted tests, "skipped" and "unresolved", were given the same label and treated as failures. However, "skipped" shouldn't imply failure. I think the check for `opts.max_failures` here was just an ad-hoc way to try to deal with mixing both skipped and failed tests into the UNRESOLVED category: when `--max-failures` was specified and we stopped executing because of it, we marked all remaining tests as UNRESOLVED, but we shouldn't print those as failures here. Anyways, things are more consistent now. We have a proper category for "skipped" (not a failure) and the interaction between max_failures and UNRESOLVED has been removed.
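
The distinction drawn in this comment can be made concrete with a small model of the result codes. The `(name, isFailure)` pairs below are copied from the Test.py hunk in this revision; the `namedtuple` definition and the `count_failures` helper are hypothetical simplifications, not lit's actual classes.

```python
from collections import namedtuple

# Simplified stand-in for lit.Test.ResultCode: a name plus a failure flag.
ResultCode = namedtuple("ResultCode", ["name", "isFailure"])

PASS = ResultCode("PASS", False)
FLAKYPASS = ResultCode("FLAKYPASS", False)
XFAIL = ResultCode("XFAIL", False)
FAIL = ResultCode("FAIL", True)
XPASS = ResultCode("XPASS", True)
UNRESOLVED = ResultCode("UNRESOLVED", True)
UNSUPPORTED = ResultCode("UNSUPPORTED", False)
TIMEOUT = ResultCode("TIMEOUT", True)
SKIPPED = ResultCode("SKIPPED", False)  # added by this revision


def count_failures(codes):
    """Hypothetical helper: number of results whose code counts as a failure."""
    return sum(1 for code in codes if code.isFailure)


# A run of three tests that stops after the first failure.
# Old labelling: the two unexecuted tests are UNRESOLVED, hence failures.
before = [FAIL, UNRESOLVED, UNRESOLVED]
# New labelling: the same two tests are SKIPPED, which is not a failure.
after = [FAIL, SKIPPED, SKIPPED]

print(count_failures(before), count_failures(after))
```

With the old labelling, every test cut off by an early stop inflates the failure count; with SKIPPED, only the test that actually failed is counted.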

Extend test for --max-failures option.

yln edited the summary of this revision. Apr 10 2020, 1:20 PM

In D77819#1974786, @yln wrote:

Can skipped tests be covered in lit's test suite? I'm not sure how easy it is to do it in a portable manner, but platform-specific coverage is better than nothing.

Added a test for the overall lit timeout (--max-time), which previously didn't have one and which lets us observe the "skipped test" logic. I am not sure how to add one for the user interrupt [Ctrl+C].

I think it would be messy: run the lit test suite in the background redirecting to a file, record the PID, poll that file in the foreground until expected passes appear, and then kill -SIGINT. Very slow tests should be skipped. It works from my bash shell, but I'm not sure about portability.

For now, your --max-failures tests are a race-free, portable way to exercise at least one code path for skipped tests, so I would not object if you felt that was sufficient.
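
The recipe described above could look roughly like the following in Python. This is a sketch under several assumptions: the child process is a hypothetical stand-in for a lit run (it is not lit), the parent reads the child's stdout through a pipe instead of polling a file, and `send_signal(signal.SIGINT)` is only expected to behave this way on POSIX systems, which mirrors the portability concern raised in the comment.

```python
import signal
import subprocess
import sys

# Hypothetical child standing in for a lit run: it reports one "pass",
# then blocks in a slow "test" until it is interrupted.
CHILD = r"""
import signal, sys, time
# Make sure SIGINT raises KeyboardInterrupt even if the parent ignores it.
signal.signal(signal.SIGINT, signal.default_int_handler)
try:
    print("PASS: fast", flush=True)
    time.sleep(60)
except KeyboardInterrupt:
    print("interrupted, skipping remaining tests", flush=True)
    sys.exit(0)
"""


def interrupt_after_first_pass(timeout=30):
    """Start the child, wait for its first pass, send SIGINT, return all output."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD],
                            stdout=subprocess.PIPE, text=True)
    first = proc.stdout.readline()      # blocks until the expected pass appears
    proc.send_signal(signal.SIGINT)     # the programmatic equivalent of Ctrl+C
    rest, _ = proc.communicate(timeout=timeout)
    return first + rest


if __name__ == "__main__":
    print(interrupt_after_first_pass())
```

Waiting for the first line of output before sending the signal plays the role of "poll until expected passes appear"; a fixed sleep would be simpler but would reintroduce a timing race.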

llvm/utils/lit/lit/main.py
283–284	Got it. Thanks.
284	Thanks for explaining.
llvm/utils/lit/tests/max-time.py
4	On heavily loaded test systems, is there a chance of a race here?

Harbormaster completed remote builds in B52722: Diff 256643. Apr 10 2020, 1:59 PM

In D77819#1974995, @jdenny wrote:

I think it would be messy: run the lit test suite in the background redirecting to a file, record the PID, poll that file in the foreground until expected passes appear, and then kill -SIGINT. Very slow tests should be skipped. It works from my bash shell, but I'm not sure about portability.

For now, your --max-failures tests are a race-free, portable way to exercise at least one code path for skipped tests, so I would not object if you felt that was sufficient.

Yes, I would like to spend my time on other improvements ;)

llvm/utils/lit/tests/max-time.py
4	Yes, race is between `--max-time=1` and `sleep 5` in [slow.txt]. I will increase 5 to 60 when landing. That should work for all practical purposes.

LGTM. Thanks!

This revision is now accepted and ready to land. Apr 10 2020, 3:04 PM

jdenny added inline comments. Apr 10 2020, 3:13 PM

llvm/utils/lit/tests/max-time.py
4	Isn't there theoretically a race between `--max-time=1` and fast.txt as well? I'm not saying you need to change it, but I at least want to be sure I'm not misunderstanding something.

yln marked an inline comment as done. Apr 10 2020, 3:22 PM

yln added inline comments.

llvm/utils/lit/tests/max-time.py
4	Yes, you are right! I forgot to consider this. I would like to keep this at 1 for now (because it increases the actual running time of the test) and only increase it if we discover that it is a problem in practice.

Closed by commit rGcbe42a9d5fa7: [lit] Add SKIPPED test result category (authored by yln). Apr 10 2020, 3:37 PM

This revision was automatically updated to reflect the committed changes.

In D77819#1975150, @yln wrote:

In D77819#1974995, @jdenny wrote:

I think it would be messy: run the lit test suite in the background redirecting to a file, record the PID, poll that file in the foreground until expected passes appear, and then kill -SIGINT. Very slow tests should be skipped. It works from my bash shell, but I'm not sure about portability.

For now, your --max-failures tests are a race-free, portable way to exercise at least one code path for skipped tests, so I would not object if you felt that was sufficient.

Yes, I would like to spend my time on other improvements ;)

So, if your --max-time test proves race-free in practice, my recipe above can be simplified a bit: just run lit in the background and record the PID, sleep long enough for expected passes, and then kill -SIGINT. Well, maybe one day. :-)

llvm/utils/lit/tests/max-time.py
4	I think that's fine.

yln marked an inline comment as done. Apr 10 2020, 4:20 PM

yln added inline comments.

llvm/utils/lit/tests/max-time.py
4	At least one bot actually hit this: http://lab.llvm.org:8011/builders/clang-cmake-armv7-quick/builds/15079/steps/ninja%20check%201/logs/FAIL%3A%20lit%3A%3A%20max-time.py
I will adapt the timeouts:
fast.txt: true (0)
--max-time: 5 <-- this is how long the test runs (bot reported test time: 1.58s, so it should be good enough)
slow.txt: sleep 60
One more slow test in the lit test suite :(
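
The timing constraint behind this adjustment can be checked with a little arithmetic. The sketch below is a simplified sequential model, not lit's scheduler: `outcomes` and its `startup` parameter (time that elapses before the first test begins) are hypothetical, and the bot-reported 1.58 s from the comment above is used here only as an assumed stand-in for that startup overhead.

```python
def outcomes(tests, max_time, startup=0.0):
    """Classify tests run back to back against an overall deadline.

    `tests` is a list of (name, duration_in_seconds) pairs. In this model a
    test passes only if it finishes by `max_time`; once one test misses the
    deadline, it and everything after it is reported as SKIPPED.
    """
    clock = startup
    timed_out = False
    results = {}
    for name, duration in tests:
        if not timed_out and clock + duration <= max_time:
            results[name] = "PASS"
            clock += duration
        else:
            timed_out = True
            results[name] = "SKIPPED"
    return results


# fast.txt runs `true` (modelled as 0 s), slow.txt runs `sleep 60`.
suite = [("fast.txt", 0), ("slow.txt", 60)]

# --max-time=1 on a slow bot: the deadline passes before fast.txt even starts.
print(outcomes(suite, max_time=1, startup=1.58))

# --max-time=5 leaves headroom for fast.txt while slow.txt is still skipped.
print(outcomes(suite, max_time=5, startup=1.58))
```

Under these assumptions, the test's expectation of one pass and one skipped test holds only when `max_time` exceeds the startup overhead and stays well below the duration of slow.txt.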

yln marked an inline comment as done. Apr 10 2020, 4:26 PM

yln added inline comments.

llvm/utils/lit/tests/max-time.py
4	https://reviews.llvm.org/rG5925c4a0ff720fa85a83a44f0358da4076297651

jdenny mentioned this in D77986: [lit] Move llvm-test-suite result codes into llvm/lit. Apr 13 2020, 7:00 AM

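Revision Contents

Path                                            Size
llvm/utils/lit/lit/Test.py                      1 line
llvm/utils/lit/lit/main.py                      13 lines
llvm/utils/lit/lit/run.py                       15 lines
llvm/utils/lit/tests/Inputs/max-time/fast.txt   1 line
llvm/utils/lit/tests/Inputs/max-time/lit.cfg    6 lines
llvm/utils/lit/tests/Inputs/max-time/slow.txt   1 line
llvm/utils/lit/tests/max-failures.py            23 lines
llvm/utils/lit/tests/max-time.py                7 lines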

Diff 256688

llvm/utils/lit/lit/Test.py

(30 lines not shown)
  PASS = ResultCode('PASS', False)
  FLAKYPASS = ResultCode('FLAKYPASS', False)
  XFAIL = ResultCode('XFAIL', False)
  FAIL = ResultCode('FAIL', True)
  XPASS = ResultCode('XPASS', True)
  UNRESOLVED = ResultCode('UNRESOLVED', True)
  UNSUPPORTED = ResultCode('UNSUPPORTED', False)
  TIMEOUT = ResultCode('TIMEOUT', True)
+ SKIPPED = ResultCode('SKIPPED', False)

  # Test metric values.

  class MetricValue(object):
      def format(self):
          """
          format() -> str

(364 lines not shown)

llvm/utils/lit/lit/main.py

(83 lines not shown) def main(builtin_params={}):
      filtered_tests = filtered_tests[:opts.max_tests]

      opts.workers = min(len(filtered_tests), opts.workers)

      start = time.time()
      run_tests(filtered_tests, lit_config, opts, len(discovered_tests))
      elapsed = time.time() - start

-     executed_tests = [t for t in filtered_tests if t.result]
+     # TODO(yln): eventually, all functions below should act on discovered_tests
+     executed_tests = [
+         t for t in filtered_tests if t.result.code != lit.Test.SKIPPED]

      if opts.time_tests:
          print_histogram(executed_tests)

-     print_results(executed_tests, elapsed, opts)
+     print_results(filtered_tests, elapsed, opts)

      if opts.output_path:
          #TODO(yln): pass in discovered_tests
          write_test_results(executed_tests, lit_config, elapsed, opts.output_path)
      if opts.xunit_output_file:
          write_test_results_xunit(executed_tests, opts)

      if lit_config.numErrors:
(145 lines not shown)
  failure_codes = [
      (lit.Test.UNRESOLVED, 'Unresolved Tests', 'Unresolved'),
      (lit.Test.TIMEOUT, 'Individual Timeouts', 'Timed Out'),
      (lit.Test.FAIL, 'Unexpected Failures', 'Failing'),
      (lit.Test.XPASS, 'Unexpected Passes', 'Unexpected Passing')
  ]

  all_codes = [
+     (lit.Test.SKIPPED, 'Skipped Tests', 'Skipped'),
      (lit.Test.UNSUPPORTED, 'Unsupported Tests', 'Unsupported'),
      (lit.Test.PASS, 'Expected Passes', ''),
      (lit.Test.FLAKYPASS, 'Passes With Retry', ''),
      (lit.Test.XFAIL, 'Expected Failures', 'Expected Failing'),
  ] + failure_codes


  def print_results(tests, elapsed, opts):
      tests_by_code = {code: [] for (code, _, _) in all_codes}
      for test in tests:
          tests_by_code[test.result.code].append(test)

      for (code, _, group_label) in all_codes:
          print_group(code, group_label, tests_by_code[code], opts)

      print_summary(tests_by_code, opts.quiet, elapsed)


  def print_group(code, label, tests, opts):
      if not tests:
          return
-     if code == lit.Test.PASS:
+     # TODO(yln): FLAKYPASS? Make this more consistent!
+     if code in {lit.Test.SKIPPED, lit.Test.PASS}:
          return
      if (lit.Test.XFAIL == code and not opts.show_xfail) or \
-        (lit.Test.UNSUPPORTED == code and not opts.show_unsupported) or \
-        (lit.Test.UNRESOLVED == code and (opts.max_failures is not None)):
+        (lit.Test.UNSUPPORTED == code and not opts.show_unsupported):
          return
      print('*' * 20)
      print('%s Tests (%d):' % (label, len(tests)))
      for test in tests:
          print(' %s' % test.getFullName())
      sys.stdout.write('\n')


(112 lines not shown)

llvm/utils/lit/lit/run.py

(36 lines not shown) def execute(self):

          If timeout is non-None, it should be a time in seconds after which to
          stop executing tests.

          Returns the elapsed testing time.

          Upon completion, each test in the run will have its result
          computed. Tests which were not actually executed (for any reason) will
-         be given an UNRESOLVED result.
+         be marked SKIPPED.
          """
          self.failures = 0

          # Larger timeouts (one year, positive infinity) don't work on Windows.
          one_week = 7 * 24 * 60 * 60  # days * hours * minutes * seconds
          timeout = self.timeout or one_week
          deadline = time.time() + timeout

-         self._execute(deadline)
-
-         # Mark any tests that weren't run as UNRESOLVED.
-         for test in self.tests:
-             if test.result is None:
-                 test.setResult(lit.Test.Result(lit.Test.UNRESOLVED, '', 0.0))
+         try:
+             self._execute(deadline)
+         finally:
+             skipped = lit.Test.Result(lit.Test.SKIPPED)
+             for test in self.tests:
+                 if test.result is None:
+                     test.setResult(skipped)

      def _execute(self, deadline):
          self._increase_process_limit()

          semaphores = {k: multiprocessing.BoundedSemaphore(v)
                        for k, v in self.lit_config.parallelism_groups.items()
                        if v is not None}

(55 lines not shown)

llvm/utils/lit/tests/Inputs/max-time/fast.txt

This file was added.

RUN: true

llvm/utils/lit/tests/Inputs/max-time/lit.cfg

This file was added.

import lit.formats
config.name = 'lit-time'
config.suffixes = ['.txt']
config.test_format = lit.formats.ShTest()
config.test_source_root = None
config.test_exec_root = None

llvm/utils/lit/tests/Inputs/max-time/slow.txt

This file was added.

RUN: sleep 60

llvm/utils/lit/tests/max-failures.py

  # Check the behavior of --max-failures option.
  #
- # RUN: not %{lit} -j 1 -v %{inputs}/max-failures > %t.out
- # RUN: not %{lit} --max-failures=1 -j 1 -v %{inputs}/max-failures >> %t.out
- # RUN: not %{lit} --max-failures=2 -j 1 -v %{inputs}/max-failures >> %t.out
- # RUN: not %{lit} --max-failures=0 -j 1 -v %{inputs}/max-failures 2>> %t.out
+ # RUN: not %{lit} -j 1 %{inputs}/max-failures > %t.out 2>&1
+ # RUN: not %{lit} --max-failures=1 -j 1 %{inputs}/max-failures >> %t.out 2>&1
+ # RUN: not %{lit} --max-failures=2 -j 1 %{inputs}/max-failures >> %t.out 2>&1
+ # RUN: not %{lit} --max-failures=0 -j 1 %{inputs}/max-failures 2>> %t.out
  # RUN: FileCheck < %t.out %s
  #
  # END.

- # CHECK: Failing Tests (35)
- # CHECK: Failing Tests (1)
- # CHECK: Failing Tests (2)
+ # CHECK-NOT: reached maximum number of test failures
+ # CHECK-NOT: Skipped Tests
+ # CHECK: Unexpected Failures: 35
+
+ # CHECK: reached maximum number of test failures, skipping remaining tests
+ # CHECK: Skipped Tests : 41
+ # CHECK: Unexpected Failures: 1
+
+ # CHECK: reached maximum number of test failures, skipping remaining tests
+ # CHECK: Skipped Tests : 40
+ # CHECK: Unexpected Failures: 2
+
  # CHECK: error: argument --max-failures: requires positive integer, but found '0'

llvm/utils/lit/tests/max-time.py

This file was added.

# Test overall lit timeout (--max-time).
#
# RUN: %{lit} %{inputs}/max-time --max-time=1 2>&1 | FileCheck %s

# CHECK: reached timeout, skipping remaining tests
# CHECK: Skipped Tests : 1
# CHECK: Expected Passes: 1
