This is an archive of the discontinued LLVM Phabricator instance.

Differential D76829

[lit] Introduce setup and teardown routines
Abandoned · Public

Authored by broadwaylamb on Mar 26 2020, 1:24 AM.

Details

Reviewers

ldionne
ddunbar
delcypher
dexonsmith
rnk
yln

Summary

This patch adds a new feature to lit. We can now define setup and teardown routines that will be run before the whole test suite starts and after it completes, respectively.

The primary use case for this feature is running std::filesystem tests in libc++ on a remote target. There, we need to copy test inputs to the target machine via SSH, run the tests on those inputs, and then, after all the tests have been run, we need to remove those test inputs from the target machine.

Diff Detail

Event Timeline

broadwaylamb created this revision. · Mar 26 2020, 1:24 AM

Herald added a project: Restricted Project. · Mar 26 2020, 1:24 AM

Herald added a subscriber: llvm-commits.

Harbormaster failed remote builds in B50504: Diff 252765! · Mar 26 2020, 2:07 AM

Can't we do that with a custom executor in libc++?

@ldionne I tried to find a way but couldn't. All the code in executor classes is run either at configuration time (when we parse lit.cfg), or while running each test. There's no way to tell it to run "when all tests have completed", AFAICT.

In D76829#1943483, @broadwaylamb wrote:

@ldionne I tried to find a way but couldn't. All the code in executor classes is run either at configuration time (when we parse lit.cfg), or while running each test. There's no way to tell it to run "when all tests have completed", AFAICT.

The executor does receive file dependencies. It could copy those and remove them after each test, no? I'm not saying setup/teardown isn't useful in lit, I'm just trying to see what other ways we can solve that problem.

In D76829#1943659, @ldionne wrote:

In D76829#1943483, @broadwaylamb wrote:

@ldionne I tried to find a way but couldn't. All the code in executor classes is run either at configuration time (when we parse lit.cfg), or while running each test. There's no way to tell it to run "when all tests have completed", AFAICT.

The executor does receive file dependencies. It could copy those and remove them after each test, no? I'm not saying setup/teardown isn't useful in lit, I'm just trying to see what other ways we can solve that problem.

It probably could. However, it would be unnecessary work that would slow down test execution significantly. Copying the inputs involves tarring them, creating a temporary directory on the target, scp'ing the archive to that temporary directory and untarring it. For me, it can easily take ~10 seconds (yeah, my target machine is in another hemisphere, but still). And we have 134 std::filesystem tests, so that's ~22 minutes spent only on copying the inputs to the target. And I haven't even counted the cleanup time.
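The estimate above can be checked directly from the two figures given in the comment (134 tests, roughly 10 seconds per copy):

```python
num_tests = 134      # std::filesystem tests, per the comment
copy_seconds = 10    # approximate cost of one copy, per the comment

total_seconds = num_tests * copy_seconds
total_minutes = total_seconds / 60
print(f"{total_seconds} s = {total_minutes:.1f} min")  # 1340 s = 22.3 min
```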

I can already think of at least one other use case that currently doesn't have a good solution: setting up iOS simulator/devices before and after compiler-rt tests.

The only concern that I have is that in lit we have this concept of "suites" defined by lit configs. These lit configs now also register setup/teardown functions, but the functions get executed at the beginning/end of the overall lit command line invocation. This is a mismatch. But maybe this is not important?! Do scenarios where setup matters already equate "lit invocation" with "test suite"?

This is what I am trying to depict:

lit command line invocation - start
  suite 1 setup
  suite 2 setup
  suite 1
    test 1 in suite 1
    test 2 in suite 1
  suite 2
    test 1 in suite 2
    test 2 in suite 2
  suite 1 teardown
  suite 2 teardown
lit command line invocation - end

Ideally, setup/teardown would execute around the suite that defined it. We can't easily accomplish this because after test discovery, a "lit run" (run.py) is just the list of all discovered tests.
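The ordering depicted above can be reproduced with a small simulation. Everything here is a hypothetical stand-in (`Registry`, `load_suite`); the only behaviour taken from the patch is that callbacks from all suites land on one shared object and run around the flattened list of tests.

```python
events = []


class Registry:
    """Hypothetical stand-in for the single LitConfig shared by all suites."""

    def __init__(self):
        self.setups = []
        self.teardowns = []


def load_suite(registry, name, tests):
    # Parsing a suite's config registers its callbacks on the shared object.
    registry.setups.append(lambda: events.append(f"{name} setup"))
    registry.teardowns.append(lambda: events.append(f"{name} teardown"))
    return [(name, test) for test in tests]


registry = Registry()
# Test discovery flattens both suites into one list of tests.
all_tests = (load_suite(registry, "suite 1", ["test 1", "test 2"])
             + load_suite(registry, "suite 2", ["test 1", "test 2"]))

try:
    for callback in registry.setups:
        callback()
    for suite, test in all_tests:
        events.append(f"{test} in {suite}")
finally:
    for callback in registry.teardowns:
        callback()

print("\n".join(events))
```

Both setups run before the first test and both teardowns after the last one, regardless of which suite a test belongs to.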

Still, from my perspective this is a good feature to have. LGTM, with nits from my side. I would like someone else to agree with me before landing though.

llvm/utils/lit/lit/LitConfig.py
73	I think I can guess why this is required, but please explain it just to make sure I understand.
185	Can we name this `suite_setup` to drive home the point that this is executed once, and not for every test?
215	Should we provide self (the lit_config) or is this implicitly captured on the caller side?
llvm/utils/lit/lit/run.py
73	My apologies for this. The explicit modeling of serial and parallel runs is my fault. That abstraction has not carried its weight. I want to remove it in the future.
llvm/utils/lit/tests/Inputs/setup-teardown/lit.cfg
11	Can this access the surrounding context?
llvm/utils/lit/tests/setup-teardown.py
2	If you provide `-j1`, then you can skip the `-DAG` below.

Rename setup_callback to suite_setup, teardown_callback to suite_teardown.
Add a comment about pickling LitConfig.
Update the test to use -j1.

In D76829#1945166, @yln wrote:

Ideally, setup/teardown would execute around the suite that defined it. We can't easily accomplish this because after test discovery, a "lit run" (run.py) is just the list of all discovered tests.

Yes, I thought about it too and indeed it's not easy to do. Not only because the information about suites is "lost", but also because I think it would significantly complicate the logic in case of parallelism.
I agree that although it would be a natural thing to do, I don't see any issue if we run those routines before and after the whole lit invocation.

llvm/utils/lit/lit/LitConfig.py
215	Yes, it can be captured.
llvm/utils/lit/tests/Inputs/setup-teardown/lit.cfg
11	Yes.

LGTM. @delcypher @rnk: what do you think?

In D76829#1943847, @broadwaylamb wrote:

In D76829#1943659, @ldionne wrote:

In D76829#1943483, @broadwaylamb wrote:

@ldionne I tried to find a way but couldn't. All the code in executor classes is run either at configuration time (when we parse lit.cfg), or while running each test. There's no way to tell it to run "when all tests have completed", AFAICT.

The executor does receive file dependencies. It could copy those and remove them after each test, no? I'm not saying setup/teardown isn't useful in lit, I'm just trying to see what other ways we can solve that problem.

It probably could. However, it would be unnecessary work that would slow down test execution significantly. Copying the inputs involves tarring them, creating a temporary directory on the target, scp'ing the archive to that temporary directory and untarring it. For me, it can easily take ~10 seconds (yeah, my target machine is in another hemisphere, but still). And we have 134 std::filesystem tests, so that's ~22 minutes spent only on copying the inputs to the target. And I haven't even counted the cleanup time.

FWIW, that's needed for correctness because tests could modify these files. When running locally, we would also need to copy the files to temporary directories, but we don't do it.

I'm curious to know -- what are you tarring up exactly? We have only a few small inputs, don't we?

In D76829#1946451, @ldionne wrote:

FWIW, that's needed for correctness because tests could modify these files. When running locally, we would also need to copy the files to temporary directories, but we don't do it.

If such tests are introduced (those that modify their inputs), we'll just copy their inputs only for them. But there are no such tests now.

I'm curious to know -- what are you tarring up exactly? We have only a few small inputs, don't we?

Only this directory: libcxx/test/std/input.output/filesystems/Inputs

Given the use cases described, I think what you have is probably the best way forward.

However, I'd add that setup steps are not parallelized. Any work that lit does up front slows the test suite down significantly, so in general we should prefer to do any setup *in* the test, rather than up front. For your use case, the overhead comes from round-trip network latency, not CPU time, so up-front setup makes sense.
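A back-of-the-envelope model of this trade-off, for work that has to be done once per test. It assumes perfect load balancing across workers and uses made-up numbers purely for illustration; neither the functions nor the figures come from lit.

```python
def upfront_wall_time(num_tests, test_s, setup_s, workers):
    # All per-test setup happens serially before the workers start.
    return num_tests * setup_s + num_tests * test_s / workers


def in_test_wall_time(num_tests, test_s, setup_s, workers):
    # Each worker pays the setup cost as part of the test it runs.
    return num_tests * (setup_s + test_s) / workers


# Hypothetical numbers, chosen only for illustration.
params = dict(num_tests=1000, test_s=1.0, setup_s=0.05, workers=8)
upfront = upfront_wall_time(**params)
in_test = in_test_wall_time(**params)
print(f"up front: {upfront:.2f} s, in test: {in_test:.2f} s")
```

The model does not cover the case discussed in this review, where a single shared setup replaces many per-test copies; there the up-front cost is paid once rather than once per test.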

In the past, @zturner wanted to create clean output directories for every test up front, so that %T wouldn't be as dangerous. I forget if it landed, but I remember that one version of the patch did a lot of IO up front (deleting all temp files from the last test run), and he had to rework it to do the FS work during test execution.

In D76829#1946482, @broadwaylamb wrote:

In D76829#1946451, @ldionne wrote:

FWIW, that's needed for correctness because tests could modify these files. When running locally, we would also need to copy the files to temporary directories, but we don't do it.

If such tests are introduced (those that modify their inputs), we'll just copy their inputs only for them. But there are no such tests now.

Well, those are tests, so if they do the wrong thing (and fail), what tells us they didn't modify the files? We're basically relying on the tests working correctly for the test suite to be correct, which is not very good. The only way of really being correct in libc++'s case is to copy the files and run the test on those copies. The only reason we're not doing it right now is because we're being lazy, but I'm actually working on fixing that right now.
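The "copy the files and run the test on those copies" approach could look roughly like the following. This is only a sketch with hypothetical names (`run_on_private_copy`, `buggy_test`); it is not how libc++'s test configuration does it.

```python
import shutil
import tempfile
from pathlib import Path


def run_on_private_copy(inputs_dir, test_body):
    """Run test_body against a throwaway copy of inputs_dir."""
    with tempfile.TemporaryDirectory() as scratch:
        private = Path(scratch) / "Inputs"
        shutil.copytree(inputs_dir, private)
        return test_body(private)


# Demo: a misbehaving "test" that overwrites one of its inputs.
with tempfile.TemporaryDirectory() as root:
    shared = Path(root) / "Inputs"
    shared.mkdir()
    (shared / "data.txt").write_text("original")

    def buggy_test(inputs):
        (inputs / "data.txt").write_text("clobbered")
        return (inputs / "data.txt").read_text()

    seen_by_test = run_on_private_copy(shared, buggy_test)
    shared_after = (shared / "data.txt").read_text()

print(seen_by_test, shared_after)  # clobbered original
```

The shared inputs survive even though the test body modified what it was given, at the price of one copy per test.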

I personally believe that a more useful feature for lit would be to allow batching sets of RUN commands or something like that. For example, imagine if we could compile all the tests locally and only then execute them on the remote host -- you wouldn't only save the cost of scping N times, you would also save on all the ssh's. That would be a major improvement. To me, trying to just save on copying the inputs is taking the easy way for less benefit, and it's not clear to me this will even work with the upcoming libc++ lit configuration, where each test will require its copy of the inputs.

By the way, I don't mean to say that this patch isn't useful -- I'm just saying that in the context of trying to solve the specific problem at hand (libc++'s std::filesystem on a remote host), I'm not convinced this is the right thing.

I personally believe that a more useful feature for lit would be to allow batching sets of RUN commands or something like that. For example, imagine if we could compile all the tests locally and only then execute them on the remote host -- you wouldn't only save the cost of scping N times, you would also save on all the ssh's. That would be a major improvement. To me, trying to just save on copying the inputs is taking the easy way for less benefit, and it's not clear to me this will even work with the upcoming libc++ lit configuration, where each test will require its copy of the inputs.

By the way, I don't mean to say that this patch isn't useful -- I'm just saying that in the context of trying to solve the specific problem at hand (libc++'s std::filesystem on a remote host), I'm not convinced this is the right thing.

Without a strong use case from libc++, I would prefer to hold off on this and see if it is still needed after we've made the other improvements.

In D76829#1946422, @yln wrote:

LGTM. @delcypher @rnk: what do you think?

I'd quite like to see the feature land for the reason you mentioned with the iOS simulator.

delcypher added inline comments. · Mar 27 2020, 7:48 PM

llvm/utils/lit/lit/LitConfig.py
71	Is `suite` really the best name here? IIUC a lit invocation can discover multiple test suites, but the callbacks in this patch aren't executed when a suite starts/finishes. IIUC the implementation calls the setup callbacks before any tests run and then calls the teardown callbacks after all tests have been executed. In that sense the callbacks are actually global to the entire lit invocation, even though the callbacks originate from a particular suite. Wouldn't `_global_setup_callbacks` (or `_global_pre_test_callbacks`) and `_global_teardown_callbacks` (or `_global_post_test_callbacks`), with similar changes elsewhere, be more descriptive names?

In D76829#1947433, @delcypher wrote:

I'd quite like to see the feature land for the reason you mentioned with the iOS simulator.

Note: we just solved this issue in a different way (by eliminating the need to boot/shutdown simulator instances).

yln resigned from this revision. · Apr 24 2020, 4:19 PM

broadwaylamb abandoned this revision. · Apr 24 2020, 4:49 PM

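Revision Contents

Path                                                    Size
llvm/utils/lit/lit/LitConfig.py                         58 lines
llvm/utils/lit/lit/run.py                               9 lines
llvm/utils/lit/tests/Inputs/setup-teardown/lit.cfg      14 lines
llvm/utils/lit/tests/Inputs/setup-teardown/test1.txt    1 line
llvm/utils/lit/tests/Inputs/setup-teardown/test2.txt    1 line
llvm/utils/lit/tests/setup-teardown.py                  9 lines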

Diff 253090

llvm/utils/lit/lit/LitConfig.py

[61 lines hidden]  def __init__(self, progname, path, quiet,
                 # The default is 'summary'.
                 self.valgrindArgs.append('--leak-check=no')
             self.valgrindArgs.extend(self.valgrindUserArgs)

         self.maxIndividualTestTime = maxIndividualTestTime
         self.parallelism_groups = parallelism_groups
         self.echo_all_commands = echo_all_commands

+        self._suite_setup_callbacks = []
+        self._suite_teardown_callbacks = []
    [inline comment, delcypher, Not Done] Is `suite` really the best name here? IIUC a lit invocation can discover multiple test suites, but the callbacks in this patch aren't executed when a suite starts/finishes. IIUC the implementation calls the setup callbacks before any tests run and then calls the teardown callbacks after all tests have been executed. In that sense the callbacks are actually global to the entire lit invocation, even though the callbacks originate from a particular suite. Wouldn't `_global_setup_callbacks` (or `_global_pre_test_callbacks`) and `_global_teardown_callbacks` (or `_global_post_test_callbacks`), with similar changes elsewhere, be more descriptive names?
+
+    def __getstate__(self):
    [inline comment, yln, Done] I think I can guess why this is required, but please explain it just to make sure I understand.
+        # An instance of LitConfig may be shared between multiple processes
+        # when using parallelism, which requires pickling that instance.
+        # However, pickling function objects is not always possible (or, in this
+        # case, always impossible), so we remove the callbacks from
+        # pickled instances.
+        # This is okay, because we only set them and run them in the parent
+        # lit.py process. Accessing them from subprocesses would be weird,
+        # so it's not supported.
+        state = dict(self.__dict__)
+        del state['_suite_setup_callbacks']
+        del state['_suite_teardown_callbacks']
+        return state
+
+    def __setstate__(self, state):
+        self.__dict__.update(state)
+        self._suite_setup_callbacks = []
+        self._suite_teardown_callbacks = []
+
     @property
     def maxIndividualTestTime(self):
         """
         Interface for getting maximum time to spend executing
         a single test
         """
         return self._maxIndividualTestTime

[76 lines hidden]  def getToolsPath(self, dir, paths, tools):
             dir = lit.util.whichTools(tools, paths)

         # bash
         self.bashPath = lit.util.which('bash', dir)
         if self.bashPath is None:
             self.bashPath = ''

         return dir

+    def suite_setup(self, callback):
    [inline comment, yln, Not Done] Can we name this `suite_setup` to drive home the point that this is executed once, and not for every test?
+        '''
+        Adds the callback to the list of setup callbacks that will be run
+        before running the test suite.
+
+        Can be used as a decorator in lit configuration files like this:
+
+            @lit_config.suite_setup
+            def setup():
+                ...
+        '''
+        self._suite_setup_callbacks.append(callback)
+        return callback
+
+    def suite_teardown(self, callback):
+        '''
+        Adds the callback to the list of teardown callbacks that will be run
+        after the test suite completes.
+
+        Can be used as a decorator in lit configuration files like this:
+
+            @lit_config.suite_teardown
+            def teardown():
+                ...
+        '''
+        self._suite_teardown_callbacks.append(callback)
+        return callback
+
+    def run_suite_setup_callbacks(self):
+        for callback in self._suite_setup_callbacks:
+            callback()
    [inline comment, yln, Not Done] Should we provide self (the lit_config) or is this implicitly captured on the caller side?
    [inline comment, broadwaylamb (author), Done] Yes, it can be captured.
+
+    def run_suite_teardown_callbacks(self):
+        for callback in self._suite_teardown_callbacks:
+            callback()
+
     def _write_message(self, kind, message):
         # Get the file/line where this message was generated.
         f = inspect.currentframe()
         # Step out of _write_message, and then out of wrapper.
         f = f.f_back.f_back
         file,line,_,_,_ = inspect.getframeinfo(f)
         location = '%s:%d' % (file, line)

[19 lines hidden]
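The `__getstate__`/`__setstate__` pair in the LitConfig.py diff above follows a common pattern for objects that hold unpicklable members. A minimal sketch of that pattern, using a hypothetical cut-down `Config` class rather than the real `LitConfig`:

```python
import pickle


class Config:
    """Hypothetical, cut-down stand-in for LitConfig."""

    def __init__(self, name):
        self.name = name
        self._callbacks = []

    def __getstate__(self):
        # Copy the instance dict and drop the entry that cannot be pickled.
        state = dict(self.__dict__)
        del state["_callbacks"]
        return state

    def __setstate__(self, state):
        # Restore everything else; the unpickled copy starts with no callbacks.
        self.__dict__.update(state)
        self._callbacks = []


# A lambda on its own cannot be pickled by the standard pickle module.
try:
    pickle.dumps(lambda: None)
    lambda_pickles = True
except (pickle.PicklingError, AttributeError):
    lambda_pickles = False

config = Config("demo")
config._callbacks.append(lambda: print("setup"))

# Thanks to __getstate__, the object round-trips despite holding a lambda.
clone = pickle.loads(pickle.dumps(config))
print(lambda_pickles, clone.name, clone._callbacks, len(config._callbacks))
```

The copy that reaches a worker process keeps its ordinary attributes but has empty callback lists, which matches the patch's comment that callbacks are only set and run in the parent process.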

llvm/utils/lit/lit/run.py

[52 lines hidden]  def execute(self):
         self.failure_count = 0
         self.hit_max_failures = False

         # Larger timeouts (one year, positive infinity) don't work on Windows.
         one_week = 7 * 24 * 60 * 60 # days * hours * minutes * seconds
         timeout = self.timeout or one_week
         deadline = time.time() + timeout

+        try:
+            self.lit_config.run_suite_setup_callbacks()
-        self._execute(deadline)
+            self._execute(deadline)
+        finally:
+            self.lit_config.run_suite_teardown_callbacks()

         # Mark any tests that weren't run as UNRESOLVED.
         for test in self.tests:
             if test.result is None:
                 test.setResult(lit.Test.Result(lit.Test.UNRESOLVED, '', 0.0))

+    def _execute(self, deadline):
+        raise NotImplementedError("Should be implemented in a subclass")
    [inline comment, yln, Not Done] My apologies for this. The explicit modeling of serial and parallel runs is my fault. That abstraction has not carried its weight. I want to remove it in the future.

     # TODO(yln): as the comment says.. this is racing with the main thread waiting
     # for results
     def _process_result(self, test, result):
         # Don't add any more test results after we've hit the maximum failure
         # count. Otherwise we're racing with the main thread, which is going
         # to terminate the process pool soon.
         if self.hit_max_failures:
             return
[101 lines hidden]

llvm/utils/lit/tests/Inputs/setup-teardown/lit.cfg

This file was added.

import lit.formats
config.name = 'setup-teardown'
config.suffixes = ['.txt']
config.test_format = lit.formats.ShTest()
config.test_source_root = None
config.test_exec_root = None

@lit_config.suite_setup
def setup():
    print("Running setup code...")

    [inline comment, yln, Not Done] Can this access the surrounding context?
    [inline comment, broadwaylamb (author), Done] Yes.
@lit_config.suite_teardown
def teardown():
    print("Running teardown code...")

llvm/utils/lit/tests/Inputs/setup-teardown/test1.txt

This file was added.

# RUN: true

llvm/utils/lit/tests/Inputs/setup-teardown/test2.txt

This file was added.

# RUN: false

llvm/utils/lit/tests/setup-teardown.py

This file was added.

# RUN: not %{lit} -j1 %{inputs}/setup-teardown | FileCheck %s

    [inline comment, yln, Not Done] If you provide `-j1`, then you can skip the `-DAG` below.
# CHECK: -- Testing: 2 tests, 1 workers --
# CHECK: Running setup code...
# CHECK: PASS: setup-teardown :: test1.txt
# CHECK: FAIL: setup-teardown :: test2.txt
# CHECK: Running teardown code...
# CHECK: Expected Passes : 1
# CHECK: Unexpected Failures: 1