This is an archive of the discontinued LLVM Phabricator instance.

test runner: switch to pure-Python timeout mechanism
ClosedPublic

Authored by tfiala on Sep 23 2015, 11:08 PM.

Details

Summary

For all the parallel test runners, provide a pure-Python mechanism for timing out dotest inferior processes that run for too long.

Stock OS X and Windows do not have a built-in timeout helper app. This allows those systems to support timeouts. It also transforms the timeout from being implemented via a system process to costing a single thread.
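For illustration, a minimal sketch of the single-thread approach the summary describes (the run_with_timeout helper is hypothetical, not the patch's actual API):

    import subprocess
    import threading

    def run_with_timeout(command, timeout_seconds):
        """Run command; kill it if it outlives timeout_seconds."""
        process = subprocess.Popen(command)
        timed_out = [False]  # mutable flag the timer thread can set

        def _kill_on_timeout():
            timed_out[0] = True
            if process.poll() is None:  # still running?
                process.kill()

        timer = threading.Timer(timeout_seconds, _kill_on_timeout)
        timer.start()
        try:
            returncode = process.wait()
        finally:
            timer.cancel()  # harmless if the timer already fired
        return returncode, timed_out[0]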

Tested on Linux and OS X with positive timeout verification.

Greg: this also comments out one of the curses cleanup calls into keyboard handling. I found it always hung the test shutdown on the curses runner. (I attached with a Python debugger to verify.)

Diff Detail

Event Timeline

tfiala updated this revision to Diff 35588.Sep 23 2015, 11:08 PM
tfiala retitled this revision from to test runner: switch to pure-Python timeout mechanism.
tfiala updated this object.
tfiala added reviewers: zturner, clayborg, labath.
tfiala added a subscriber: lldb-commits.
tfiala added inline comments.
test/curses_results.py
223 ↗(On Diff #35588)

Greg, let me know if there's something more sensible I can do here rather than comment it out. It never seemed to return and just kept looping.

labath edited edge metadata.Sep 24 2015, 3:36 AM

I don't want to stand in the way of progress (and I do think that getting rid of the timeout dependency is progress), but this implementation regresses in a couple of features compared to using timeout:

  • timeout tries (with moderate success) to clean up children spawned by the main process. Your implementation will (afaik) kill only the main process. This is especially important for build bots, since leaked processes will starve the bot's resources after a while (and we have had problems with this on our darwin build bot).
  • we intentionally had timeout terminate the child processes with SIGQUIT, because this produces core files of the terminated processes. I have found this very useful when diagnosing the sources of hanging tests. I'd like to retain the ability to do this, even if it is not enabled by default.

Do you think that the benefits of this change outweigh these issues? If so, and they don't cause our bots to break, then we can put it in (and I'll probably implement the core dumping feature when I find some time), but I thought you should be aware of these downsides.
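For reference, the process-group cleanup described above looks roughly like this on POSIX (a minimal sketch; the sleep command is a placeholder, and this is not code from timeout or from this patch):

    import os
    import signal
    import subprocess

    # Make the child the leader of a new process group so a later kill
    # can also reach any grandchildren it spawns (POSIX only).
    child = subprocess.Popen(["sleep", "300"], preexec_fn=os.setpgrp)

    # Signal the whole group rather than just the immediate child.
    os.killpg(os.getpgid(child.pid), signal.SIGQUIT)
    child.wait()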

zturner edited edge metadata.Sep 24 2015, 8:26 AM
zturner added a subscriber: zturner.

If I understand correctly, the process hierarchy used to look like this:

python - dotest.py

|__ python - multiprocessing fork
    |__ python - lldbtest
        |__ inferior executable

And now looks like this:

python - dotest.py

|__ python - lldbtest
      |__ inferior executable

In either case, the process.terminate() call runs in the python - lldbtest process and terminates the inferior, so I don't see how the behavior is any different than before. Do you mean that with gtimeout it's creating a separate process group for the inferior, so that if the inferior itself spawns children, those will get cleaned up too? Do any inferiors actually do that? And if so, we could always use the same logic from Python to create the inferior in a separate process group.
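A minimal sketch of spawning an inferior in its own process group on both platform families (an illustration of the idea, not the patch's code):

    import os
    import subprocess
    import sys

    def spawn_in_new_process_group(command):
        """Launch command so it does not share our process group."""
        if sys.platform == "win32":
            # Windows analogue: request a new process group at creation.
            return subprocess.Popen(
                command, creationflags=subprocess.CREATE_NEW_PROCESS_GROUP)
        # POSIX: make the child a process-group leader before exec.
        return subprocess.Popen(command, preexec_fn=os.setpgrp)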

I don't want to stand in the way of progress (and I do think that getting rid of the timeout dependency is progress), but this implementation regresses in a couple of features compared to using timeout:

  • timeout tries (with moderate success) to clean up children spawned by the main process. Your implementation will (afaik) kill only the main process. This is especially important for build bots, since leaked processes will starve the bot's resources after a while (and we have had problems with this on our darwin build bot).

Fair enough. Either yesterday or the day before, I put up a change that ensures the dotest inferiors are in a separate process group. I could (on non-Windows) send that process group a SIGQUIT. The reason I prefer not to use SIGQUIT is that it is catchable, and thus can be ignored. Once something can be ignored, the runner needs to be able to handle the timed-out child really not ending from a SIGQUIT. That means either more logic around timing out on the "wait for death", or possibly not reaping. In the case of a test runner, I'd probably put the focus on making sure the test runner makes it through, ahead of getting the core dump from a process that times out.

That said, my initial implementation of this did do the following:

  • Start with a SIGQUIT (via subprocess.Popen.terminate(), so it does something reasonable on Windows). Wait for 5 seconds to see if it wraps up (by way of waiting for the communicate() thread to die).
  • If still not dead, send a SIGKILL (via subprocess.Popen.kill()).

I thought the added complication wasn't worth it, but it's not that complicated. And would not regress behavior on the Linux side. The only diff would be, on non-Windows, send a kill to the process group (which has already been created for the inferior dotest), and just use the terminate() call on Windows.

I can give that a whirl. It's a bit more code but is not going to be too bad.
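A sketch of that two-step terminate on POSIX (the function name and the polling loop are illustrative; process is a subprocess.Popen):

    import signal
    import time

    def soft_then_hard_terminate(process, grace_seconds=5.0):
        """SIGQUIT first (catchable, may dump core); SIGKILL if ignored."""
        process.send_signal(signal.SIGQUIT)
        deadline = time.time() + grace_seconds
        while time.time() < deadline:
            if process.poll() is not None:
                return process.returncode  # died during the grace period
            time.sleep(0.1)
        process.kill()  # SIGKILL cannot be caught or ignored
        return process.wait()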

  • we intentionally had timeout terminate the child processes with SIGQUIT, because this produces core files of the terminated processes. I have found this very useful when diagnosing the sources of hanging tests. I'd like to retain the ability to do this, even if it is not enabled by default.

Do you think that the benefits of this change outweigh these issues? If so, and they don't cause our bots to break, then we can put it in (and I'll probably implement the core dumping feature when I find some time), but I thought you should be aware of these downsides.

I haven't followed the recent changes too closely, but at some point the hierarchy was:

dosep.py
  dosep.py fork
    timeout
      dotest.py (lldb client)
        lldb-server
          inferior
        other executables which were sometimes spawned by the tests

Timeout was making sure none of its children survived after the test timed out. I don't know how it did that, but I presume it ran them in a separate process group. This was not always successful (some of our tests like to create process groups as well), but it was pretty good at compartmentalizing the tests and making sure we leave no processes hanging around.

I don't want to stand in the way of progress (and I do think that getting rid of the timeout dependency is progress), but this implementation regresses in a couple of features compared to using timeout:

  • timeout tries (with moderate success) to clean up children spawned by the main process. Your implementation will (afaik) kill only the main process. This is especially important for build bots, since leaked processes will starve the bot's resources after a while (and we have had problems with this on our darwin build bot).

Fair enough. Either yesterday or the day before, I put up a change that ensures the dotest inferiors are in a separate process group. I could (on non-Windows) send that process group a SIGQUIT. The reason I prefer not to use SIGQUIT is that it is catchable, and thus can be ignored. Once something can be ignored, the runner needs to be able to handle the timed-out child really not ending from a SIGQUIT. That means either more logic around timing out on the "wait for death", or possibly not reaping. In the case of a test runner, I'd probably put the focus on making sure the test runner makes it through, ahead of getting the core dump from a process that times out.

None of the processes we run catch SIGQUIT, so it's mostly safe, but we can do the QUIT+sleep+KILL combo to be safe.

The problem with the race conditions is that they are very hard to reproduce. I often need to run the test suite at full speed dozens of times to be able to catch it happening, and then the only way of diagnosing the issue is digging through the core files.

Thanks for working on this once again :)

If I understand correctly, the process hierarchy used to look like this:

python - dotest.py

|__ python - multiprocessing fork
    |__ python - lldbtest
        |__ inferior executable

And now looks like this:

python - dotest.py

|__ python - lldbtest
      |__ inferior executable

In either case, the process.terminate() call runs in the python - lldbtest process and terminates the inferior, so I don't see how the behavior is any different than before. Do you mean that with gtimeout it's creating a separate process group for the inferior, so that if the inferior itself spawns children, those will get cleaned up too?

The man page for timeout (this one here: http://linux.die.net/man/1/timeout) doesn't explicitly call out that it creates a process group, but that wouldn't at all be out of character for a command like that. Interestingly, it does call out the issue that using a catchable signal means it can't guarantee it really shuts down the app if the app handles the signal and essentially ignores it.

Do any inferiors actually do that? And if so, we could always use the same logic from Python to create the inferior in a separate process group.

We have that logic in for non-Windows. That's the change I pinged you about, to see if you wanted to add the Windows flag for creating a separate process group. Since I no longer recall the semantics of that on the Windows side, I wasn't sure if that was useful to you.

I think sending the signal to the process group (since we have one) is probably the right way to go here. But I also think we get into the realm of "did we really kill it" if we start playing around with catchable signals. Also, there is going to be a divergence on how that works on Unix-like vs. Windows. The process group kill command is os.killpg(), which is listed as Unix-only.

I'll come up with the alternative and we can see where we go from there.

Thanks for working on this once again :)

Sure thing!

I'm ok with leaving some straggling processes on Windows for now, because
it's obviously better than what we have currently, which is nothing. I can
implement some sort of helper module at some point that exposes a new
createProcess function that supports this for Windows and Unix with a
single interface once it becomes a problem.

Which reminds me of another thing: since you're working heavily on the test suite, one thing I've always wanted is what I just mentioned: something like an lldbxplat module, which is essentially a helper module that exposes a single interface for doing things that differ depending on the platform. This would essentially let us remove almost every single occurrence of `if sys.platform.startswith('win32')` from the entire test suite, which would really help maintainability.

It's something I'll get to myself one day if nobody else beats me to it, but since you're making a lot of great improvements to the test suite recently, I figured I'd throw it out there.
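As a sketch of the idea (the lldbxplat name comes from the comment above; the helper functions shown are hypothetical):

    # lldbxplat.py: one home for platform checks, instead of scattering
    # sys.platform conditionals across the test suite.
    import sys

    def is_windows():
        return sys.platform == "win32"

    def executable_filename(basename):
        """Append '.exe' where the platform expects it."""
        return basename + ".exe" if is_windows() else basename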

Which reminds me of another thing: since you're working heavily on the test suite, one thing I've always wanted is what I just mentioned: something like an lldbxplat module, which is essentially a helper module that exposes a single interface for doing things that differ depending on the platform. This would essentially let us remove almost every single occurrence of `if sys.platform.startswith('win32')` from the entire test suite, which would really help maintainability.

That's a good idea. I'm (somewhat slowly) attempting to work in some clean up each time I touch it. I'll hit that at some point. I'll start winding down on changes to the test infrastructure soon-ish. If I don't hit that by the time I stop heavy changes, definitely feel free to jump on it!

I do intend to get back to this after resolving the cmake build issue on OS X. Hopefully by tonight.

test/curses_results.py
223 ↗(On Diff #35588)

Greg got rid of this call this morning. There was a 'q' command intended to be there that wasn't yet, which prevented us from exiting at the end. For now we're just taking this out.

That's a good idea. I'm (somewhat slowly) attempting to work in some clean up each time I touch it. I'll hit that at some point. I'll start winding down on changes to the test infrastructure soon-ish. If I don't hit that by the time I stop heavy changes, definitely feel free to jump on it!

Okay, I couldn't help myself. I'm taking a bit more time on this as I set it up much more nicely. I'm pulling all the process management into another module and breaking out some platform-specific bits into a per-platform helper.

I won't have something up until tomorrow later on at the earliest.

+100, great :) Once it's in, it will be much easier to press for others to move their platform-specific bits into this module, or to do it myself when I'm writing platform-specific stuff.

+100, great :)

:-)

tfiala updated this revision to Diff 35788.Sep 25 2015, 6:24 PM
tfiala edited edge metadata.

Work in progress.

This patch is working and timing out consistently on Linux (on a Jenkins bot), and is running normally on OS X. I haven't coerced a timeout on OS X yet.

I'll be adding some tests for the timeout and core generation, and plan to add a --no-core-for-timeout option since I don't think that is universally desirable. (Right now it always attempts to generate cores via SIGQUIT on Posix-like systems, but the code is there to do a SIGTERM instead if no cores are desired. I just haven't hooked it up.)
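A sketch of the POSIX-side signal selection that option would control (the helper name is illustrative):

    import signal

    def soft_terminate_signal(want_core):
        """SIGQUIT produces a core file by default; SIGTERM does not."""
        return signal.SIGQUIT if want_core else signal.SIGTERM

    # e.g.: process.send_signal(soft_terminate_signal(want_core))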

Zachary, you can see what I've got in place on the Windows side. If you know some of that is wrong, feel free to let me know ahead of time and I can change it.

Hope to get those tests in tonight and have a formal "eval this patch" setup.

tfiala added inline comments.Sep 25 2015, 6:30 PM
test/dosep.py
245

This message should include the filename.

test/process_control.py
273 ↗(On Diff #35788)

This will want to store the popen_process.using_process_groups property like I do on the Posix side, if terminate would do something different based on the presence or absence of process groups. Not storing it was an oversight.

I won't be able to have a serious look until Monday, as I'm still remote. Hopefully you aren't working on weekends :)

I won't be able to have a serious look until Monday, as I'm still remote.

Oh no worries.

Hopefully you aren't working on weekends :)

:-P

tfiala updated this revision to Diff 35814.Sep 26 2015, 4:37 PM

Minor refactor, moving new components to test/test_runner/lib.

Will be adding test runner tests to test/test_runner/test and will make sure the normal test runner doesn't try to run them. (These are tests of the test infrastructure itself, not lldb tests that run on it.)

This is vs. llvm.org lldb trunk r248676.

Note I've also now had positive timeout confirmation on OS X. As I suspected, it is less convenient there to have the soft kill do a SIGQUIT, since on OS X this brings up the crash log generator dialog, where user input is then required (very bad if the tests are running in a session that doesn't have access to a window server). I'll be working in --no-core-on-timeout and --core-on-timeout options, and have it default to "no" on OS X and "yes" everywhere else. When no cores are desired, we'll generate a SIGTERM instead of a SIGQUIT on the soft terminate request.

When no cores are desired, we'll generate a SIGTERM instead of a SIGQUIT on the soft terminate request.

That's on the POSIX-y systems, of course. It would be cool if Zachary can work in the mini crashdumps or similar on Windows.

tfiala updated this revision to Diff 35829.EditedSep 27 2015, 10:28 PM

Tests added. Ready for review.

The change now has two levels of terminate:

  • soft terminate, which uses a signal or process control mechanism to tell the process to end. It optionally can use a mechanism that triggers a core dump / crashlog / minidump if supported by the underlying platform. Right now, only Linux of the Posix-y platforms requests the core (via SIGQUIT). The others use SIGTERM. I will plumb through an option for that later (as mentioned in prior comments in this review), but I'm at the max time I can spend on this right now and this'll need to do. There is also a timeout on how long the process driver will allow the soft terminate to take to wrap up. This is relevant for core dumps that might have a huge amount of memory to dump to disk. It defaults to 10 seconds, but can be overridden by an environment variable, LLDB_TEST_SOFT_TERMINATE_TIMEOUT; whatever text is in it will be converted to a float (see the sketch after this list). It represents the number of seconds within which any soft terminate option must wrap up. If a process is actively blocking/ignoring that termination signal, then this represents the total time that will be "wasted" waiting to figure this out.
  • hard terminate: after the soft terminate attempt is made, plus the soft terminate timeout, then if the process is still not eliminated, the hard terminate mechanism will be used. On Posix-y systems, this is a SIGKILL.
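A sketch of reading that variable as described in the first bullet (the variable name and 10-second default come from the comment; the helper name is illustrative):

    import os

    def soft_terminate_timeout(default_seconds=10.0):
        """Seconds a soft terminate (e.g. a core dump) may take to finish."""
        value = os.environ.get("LLDB_TEST_SOFT_TERMINATE_TIMEOUT")
        return float(value) if value else default_seconds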

@zturner, you will want to run the tests in lldb/test/test_runner/test. Do that with:
cd lldb/test/test_runner/test
python process_control_tests.py

In some cases where docs were clear, I did the bit that seemed to make sense on the Windows side. In others, I left it for you to fill in. Feel free to skip some/any tests that don't make sense if, for example, you only end up wanting to support one level of killing on Windows. (We can probably work in a "has soft terminate mechanism" query into the process_control.ProcessHelper class, which has per-platform derived classes, if we want to go that way).

Also note I will start migrating test infrastructure bits into the lldb/test/test_runner/lib directory over time. The two new files from this change are going in there.

I'd like to get this in sooner than later as it resolves numerous hangs on OS X where we don't have a timeout guard. I've tried it on Linux (4, 8 and 24 core machines) as well as OS X (8 core MBPs), and have run the process_control_tests on them without issue.

Will test this out tomorrow

Looks good to me. :)

If you do end up adding a command line option, I think you can go for --enable-core-on-timeout and default to off for all platforms. I suspect I am the only one using this, and it makes a much safer default.

Looks good to me. :)

If you do end up adding a command line option, I think you can go for --enable-core-on-timeout and default to off for all platforms. I suspect I am the only one using this, and it makes a much safer default.

Okay, sounds good. I expect some additions that will be needed from Zachary, so if we want to default it off and add a flag, I'll probably get that part in with the initial check-in.

Sorry, our desks were reconfigured over the weekend, so I just now got my
computer turned on. I'm syncing code and hopefully will have a working
build soon.

Sorry, our desks were reconfigured over the weekend, so I just now got my
computer turned on. I'm syncing code and hopefully will have a working
build soon.

Sounds good. I expect you'll find a few things you'll want to add/adjust. If you get those patches to me, I'll include them in the check-in.

Sorry, our desks were reconfigured over the weekend, so I just now got my
computer turned on. I'm syncing code and hopefully will have a working
build soon.

Sounds good. I expect you'll find a few things you'll want to add/adjust. If you get those patches to me, I'll include them in the check-in.

Also note the test code for this check-in assumes there is a python in the path when it builds the command to run the inferior test subject. (This is the thing we run against to verify that the ProcessDriver --- aka the child process runner with timeout support --- gets return codes, witnesses timeouts, witnesses child processes that choose to ignore soft terminate signals and gets the bad guy with a more lethal (hard) termination option, etc.) You might need to tweak that in the test runner. That python invocation doesn't require any special lldb support - it is just testing that a process can be launched, and that process happens to be a python interpreter session running the inferior.py test subject. So I figured that was probably fine as is.
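For flavor, a minimal test subject in the spirit of inferior.py (only the --sleep flag is confirmed by the test output later in this thread; the other flags are hypothetical):

    import argparse
    import signal
    import sys
    import time

    def main():
        parser = argparse.ArgumentParser()
        parser.add_argument("--sleep", type=float, default=0.0)
        parser.add_argument("--return-code", type=int, default=0)
        parser.add_argument("--ignore-signal", type=int,
                            action="append", default=[])
        args = parser.parse_args()
        for signum in args.ignore_signal:
            # Shrug off soft-terminate attempts (POSIX).
            signal.signal(signum, signal.SIG_IGN)
        time.sleep(args.sleep)
        return args.return_code

    if __name__ == "__main__":
        sys.exit(main())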

Can you rebase against ToT? I'm having trouble applying the patch.

Sorry, ignore me. My brain is just off, it's working

Sorry, ignore me. My brain is just off, it's working

Okay, cool. Yeah, I can update it if needed; I think I saw a 1-line dosep.py change go in between my last update late last night and early this morning. But it sounds like you made it past there.

This is what I get.

d:\src\llvm\tools\lldb\test>cd test_runner

d:\src\llvm\tools\lldb\test\test_runner>cd test

d:\src\llvm\tools\lldb\test\test_runner\test>c:\Python27_LLDB\x86\python_d.exe
process_control_tests.py

..E.EE

ERROR: test_hard_timeout_works (__main__.ProcessControlTimeoutTests)

Driver falls back to hard terminate when soft terminate is ignored.

Traceback (most recent call last):

File "process_control_tests.py", line 177, in test_hard_timeout_works
  options="--sleep 120"),
File "process_control_tests.py", line 83, in inferior_command
  cls._suppress_soft_terminate(command)
File "process_control_tests.py", line 70, in _suppress_soft_terminate
  for signum in helper.soft_terminate_signals():

TypeError: 'NoneType' object is not iterable

ERROR: test_soft_timeout_works_core (__main__.ProcessControlTimeoutTests)

Driver uses soft terminate (with core request) when process times out.

Traceback (most recent call last):

File "process_control_tests.py", line 157, in test_soft_timeout_works_core
  self._soft_timeout_works(True)
File "process_control_tests.py", line 150, in _soft_timeout_works
  helper.was_soft_terminate(driver.returncode, with_core),
File "..\lib\process_control.py", line 192, in was_soft_terminate
  raise Exception("platform needs to implement")

Exception: platform needs to implement

ERROR: test_soft_timeout_works_no_core (__main__.ProcessControlTimeoutTests)

Driver uses soft terminate (no core request) when process times out.

Traceback (most recent call last):

File "process_control_tests.py", line 162, in

test_soft_timeout_works_no_core

  self._soft_timeout_works(False)
File "process_control_tests.py", line 150, in _soft_timeout_works
  helper.was_soft_terminate(driver.returncode, with_core),
File "..\lib\process_control.py", line 192, in was_soft_terminate
  raise Exception("platform needs to implement")

Exception: platform needs to implement


Ran 6 tests in 10.351s

FAILED (errors=3)
[37558 refs]

Looking into these now.

This is what I get.

d:\src\llvm\tools\lldb\test>cd test_runner

d:\src\llvm\tools\lldb\test\test_runner>cd test

d:\src\llvm\tools\lldb\test\test_runner\test>c:\Python27_LLDB\x86\python_d.exe
process_control_tests.py

..E.EE

ERROR: test_hard_timeout_works (__main__.ProcessControlTimeoutTests)

Driver falls back to hard terminate when soft terminate is ignored.

Traceback (most recent call last):

File "process_control_tests.py", line 177, in test_hard_timeout_works
  options="--sleep 120"),
File "process_control_tests.py", line 83, in inferior_command
  cls._suppress_soft_terminate(command)
File "process_control_tests.py", line 70, in _suppress_soft_terminate
  for signum in helper.soft_terminate_signals():

TypeError: 'NoneType' object is not iterable

This appears to be a missing "is not None" check on helper.soft_terminate_signals(). I'm expecting we'll need to figure out how Windows will want to test the "child process ignores the pleasant/nice kill attempt -- aka a soft terminate" case. I doubt the right thing to do is to tell the inferior to ignore signal numbers on Windows. In this case, here:

File "process_control_tests.py", line 83, in inferior_command

cls._suppress_soft_terminate(command)

we'll need to figure out what Windows needs to do to tell the inferior.py script how to ignore soft terminate signals.

One option I almost implemented, but wanted to check whether you need first, is to switch the inferior.py parameter from the sense of "ignore these signals" to "do whatever platform thing is necessary to avoid a soft terminate". For the Posix-y systems, that would be ignoring SIGTERM and SIGQUIT. For Windows, it would be whatever you need. Then the test side can just add that (new) flag that says something like "--ignore-soft-terminate". That is probably the way to go here. You still need to figure out what goes in the handler for that in inferior.py, but that would be the general idea.
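On the POSIX side, the handler for such a flag could be as small as this (a sketch of the proposal above, not existing code):

    import signal

    def ignore_soft_terminate_signals():
        """Ignore the catchable soft-terminate signals (POSIX only)."""
        for signum in (signal.SIGTERM, signal.SIGQUIT):
            signal.signal(signum, signal.SIG_IGN)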

ERROR: test_soft_timeout_works_core (__main__.ProcessControlTimeoutTests)

Driver uses soft terminate (with core request) when process times out.

Traceback (most recent call last):

File "process_control_tests.py", line 157, in test_soft_timeout_works_core
  self._soft_timeout_works(True)
File "process_control_tests.py", line 150, in _soft_timeout_works
  helper.was_soft_terminate(driver.returncode, with_core),
File "..\lib\process_control.py", line 192, in was_soft_terminate
  raise Exception("platform needs to implement")

Exception: platform needs to implement

ERROR: test_soft_timeout_works_no_core (__main__.ProcessControlTimeoutTests)

Driver uses soft terminate (no core request) when process times out.

Traceback (most recent call last):

File "process_control_tests.py", line 162, in

test_soft_timeout_works_no_core

  self._soft_timeout_works(False)
File "process_control_tests.py", line 150, in _soft_timeout_works
  helper.was_soft_terminate(driver.returncode, with_core),
File "..\lib\process_control.py", line 192, in was_soft_terminate
  raise Exception("platform needs to implement")

Exception: platform needs to implement

This is one of the spots I need you to fill in.


Ran 6 tests in 10.351s

FAILED (errors=3)
[37558 refs]

Looking into these now.

The "was_soft_terminate()" method is looking at the subprocess.Popen-like object's returncode and, judging by that, certifying that it was (or was not) a soft terminate that caused the exit.

If you need the Popen-like object to figure that out (e.g. if you need to look at some of the Windows-extended Popen attributes), we can rearrange this to take the Popen-like object rather than only its returncode. That would be totally fine.
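For context, the POSIX-side check can lean on the fact that Popen.wait() returns the negated signal number when a child dies from a signal. A sketch assuming the SIGQUIT/SIGTERM split discussed earlier:

    import signal

    def was_soft_terminate(returncode, with_core):
        """Did the child exit via our soft-terminate signal?"""
        if with_core:
            return returncode == -signal.SIGQUIT  # core-producing path
        return returncode == -signal.SIGTERM      # no-core path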

As far as I can tell, the value of the return code is undocumented when you
use Popen.terminate() on Windows. I don't know what that means for the
patch. It's quite a bit more complicated than I anticipated based on the
original thread to lldb-dev. I thought it was just going to call
Popen.terminate and assume that means it was a hard terminate. I still
don't grasp the need for all the complexity.

A few experiments on a throwaway program suggest that using
Popen.terminate() sets returncode to 1, but there doesn't seem to be any
guarantee that this is always the case.

tfiala added a comment.EditedSep 28 2015, 12:42 PM

As far as I can tell, the value of the return code is undocumented when you
use Popen.terminate() on Windows.

Okay, interesting. I found what looked like a race on Linux and OS X where reading the Popen object's returncode member (the normal way to get the return code) could later flip away from the actual result received after a wait() call, i.e.:

    popen_object.wait()
    self.returncode = popen_object.returncode
    ...
    returncode2 = popen_object.returncode
    # and sometimes find returncode2 != self.returncode

I switched to using the result of wait() as the official return code.

I don't know what that means for the
patch. It's quite a bit more complicated than I anticipated based on the
original thread to lldb-dev. I thought it was just going to call
Popen.terminate and assume that means it was a hard terminate. I still
don't grasp the need for all the complexity.

It's the need for the core file generation, which requires a soft (i.e. deflectable by the target process) terminate request. That in turn requires a two-level kill mechanism, since we need to make sure the process really dies in a long-lived test runner environment.

I can do one more adjustment. I'll make the soft terminate optional per platform. We'll disable it on Windows and only enable it on Posixy stuff. Then on Windows there will only be the hard terminate.

A few experiments on a throwaway program suggest that using
Popen.terminate() sets returncode to 1, but there doesn't seem to be any
guarantee that this is always the case.

What I was doing was writing a value into the subprocess.Popen-like object on the Posix side, to store extra state needed (e.g. whether we're killing by process group, if we created one, or just killing the process directly; those are different operations on the Posix side). For this, you could just write into the popen object that you killed it with a hard terminate, and then read back that value on the verification side to say "oh yes, that was really the mechanism by which it was killed."

That point is moot, though, if there is only one way to kill on Windows. If there is only one way to kill, then one of these should happen:

  1. If we don't care about crashdumps on Windows, which I think I heard you say is true, then we make the soft terminate level optional by platform, and skip it (i.e. indicate it is not available) on Windows.
  2. If we do care about crashdumps on Windows, then we probably need to untangle the conflation of the soft terminate level (where a target process can ignore the request) from crash dump generation. On Posix, the crash dump request is tied to the soft termination signal used. But on Windows, we could allow that to happen on the (one and only) hard terminate level.

I can get that together (assuming option #1) relatively quickly.

I don't think we can drop the complexity of the 2-tier kill level in the driver, though, given the desire to support core dump generation.

I can get that together (assuming option #1) relatively quickly.

I don't think we can drop the complexity of the 2-tier kill level in the driver, though, given the desire to support core dump generation.

I'm not going to touch this until I hear back from you, though, Zachary. Anything I do on this is way beyond the time I have to allocate to it at this point, so I'll have to stub out a dead simple solution (i.e. essentially just about what it was doing before) for Windows if none of this sounds good for you. I have to get OS X to stop hanging.

Yea, on Windows we can support core dump generation externally without touching the python script, so that's not needed for us.

Ironically, at the system level there is a way to specify the return code
of the process when you terminate it. And Python internally must be using
this API, because it's the only way to do it. They just decided not to
expose that parameter through the subprocess.terminate() method, and
instead they pass their own magic value. If someone is clever enough to
dig through python source code and figure out what value it uses, we could
use that. But I think on Windows it's safe to assume that if you call
process.terminate(), then it was a hard terminate, and we can ignore
everything else.

Does that make sense / is easy to do?

Yea, on Windows we can support core dump generation externally without touching the python script, so that's not needed for us.

Okay, good!

Ironically, at the system level there is a way to specify the return code
of the process when you terminate it. And Python internally must be using
this API, because it's the only way to do it.

Hah, that's interesting.

They just decided not to
expose that parameter through the subprocess.terminate() method, and
instead they pass their own magic value. If someone is clever enough to
dig through python source code and figure out what value it uses, we could
use that. But I think on Windows it's safe to assume that if you call
process.terminate(), then it was a hard terminate, and we can ignore
everything else.

Does that make sense / is easy to do?

Yes, makes sense, and yes, I can accommodate by just making that soft terminate level optional by platform. I'll get working on that after lunch, and I hope to have a patch up well before I head home today.

A few more comments

test/test_runner/test/process_control_tests.py
64

Change nt to Windows (unless you're removing this logic as discussed earlier)

81

Change "python" to sys.executable

Random thought: If you want to generate a core dump, we already have LLDB
attached to the process, so you have an SBProcess. Couldn't you use
Process.Halt, then call whatever method is necessary to have the debugger
create a core, then Process.Kill?

tfiala added inline comments.Sep 28 2015, 2:20 PM
test/test_runner/test/process_control_tests.py
64

Ah okay. I was copying the "nt" from somewhere else in the source code. We might want to grep for that if it is wholesale wrong. I assumed it was left over from the transition to the NT codebase in the Windows 2000 time period.

81

Okay. Better.

zturner added inline comments.Sep 28 2015, 2:25 PM
test/test_runner/test/process_control_tests.py
64

It's really confusing. There's os.name and sys.platform, which return different strings. It's possible the one you saw was checking os.name, for which nt is one of the valid return values. That's another good candidate for the cross-platform portability module; we could just have lldb_platform.os() which returns an enum. I was a little bummed to see the process_control module as a subfolder of test_runner, because all of the helper-related stuff could be at a higher level, usable from anywhere in the test suite. But we can tackle that later.

Random thought: If you want to generate a core dump, we already have LLDB
attached to the process, so you have an SBProcess. Couldn't you use
Process.Halt, then call whatever method is necessary to have the debugger
create a core, then Process.Kill?

I'm not sure I follow. Here's the process layout:

  1. dotest.py parallel runner -> 2. worker thread/process -> 3. dotest.py inferior process -> 4. optional inferior test subject used by the lldb test

Item 3 above is the process that needs to be killed and possibly core dumped. It happens to be a process that may have an optional inferior (item #4) being tested by the lldb test code running in #3. But we're trying to kill 3, and all of its children, and have 3 core dump. (We don't care about a core dump for #4).

If we wanted to get a core for #4, that would seem possibly useful, but that's not the core we're looking for.

Did I follow your idea right?

zturner added inline comments.Sep 28 2015, 2:27 PM
test/test_runner/test/process_control_tests.py
64

Oh yea, in addition to sys.platform there's also platform.system, as you've used here. And even *those* don't return the same values.

Random thought: If you want to generate a core dump, we already have LLDB
attached to the process, so you have an SBProcess. Couldn't you use
Process.Halt, then call whatever method is necessary to have the debugger
create a core, then Process.Kill?

I'm not sure I follow. Here's the process layout:

  1. dotest.py parallel runner -> 2. worker thread/process -> 3. dotest.py inferior process -> 4. optional inferior test subject used by the lldb test

Item 3 above is the process that needs to be killed and possibly core dumped. It happens to be a process that may have an optional inferior (item #4) being tested by the lldb test code running in #3. But we're trying to kill 3, and all of its children, and have 3 core dump. (We don't care about a core dump for #4).

If we wanted to get a core for #4, that would seem possibly useful, but that's not the core we're looking for.

Did I follow your idea right?

Ahh ok I thought it was a core of the inferior. My bad. Anyway that doesn't change anything about what I said earlier about not needing the soft terminate logic on Windows :)

tfiala added a comment.EditedSep 28 2015, 2:32 PM

Ah I see I morphed all those together (the platform naming).

... bummed to see them under test_runner...

Heh, that's funny. I was attempting to do a service of decluttering that top level test directory, which to me looks like somebody threw up a bunch of python code :-) I didn't anticipate it would have the opposite effect!

We can tackle that a few ways: make it a package, which accomplishes 99% of what I was trying to do --- getting things out of that top level lldb/test, *while* still allowing tests to pull in some of that functionality if they want. So it becomes something like:

lldb/test/lldb_platform/
    __init__.py
    ... (the package's .py files) ...
    test/
        (the tests for the package)

And then whatever is in there could be loaded from an lldb test with:

import lldb_platform.{whatever}

or

from lldb_platform import {whatever}

That seems totally reasonable to me. How does that sound?

Random thought: If you want to generate a core dump, we already have LLDB
attached to the process, so you have an SBProcess. Couldn't you use
Process.Halt, then call whatever method is necessary to have the debugger
create a core, then Process.Kill?

I'm not sure I follow. Here's the process layout:

  1. dotest.py parallel runner -> 2. worker thread/process -> 3. dotest.py inferior process -> 4. optional inferior test subject used by the lldb test

Item 3 above is the process that needs to be killed and possibly core dumped. It happens to be a process that may have an optional inferior (item #4) being tested by the lldb test code running in #3. But we're trying to kill 3, and all of its children, and have 3 core dump. (We don't care about a core dump for #4).

If we wanted to get a core for #4, that would seem possibly useful, but that's not the core we're looking for.

Did I follow your idea right?

Ahh ok I thought it was a core of the inferior. My bad. Anyway that doesn't change anything about what I said earlier about not needing the soft terminate logic on Windows :)

Okay!

Ah I see I morphed all those together (the platform naming).

... bummed to see them under test_runner...

Heh, that's funny. I was attempting to do a service of decluttering that top level test directory, which to me looks like somebody threw up a bunch of python code :-) I didn't anticipate it would have the opposite effect!

We can tackle that a few ways: make it a package, which accomplishes 99% of what I was trying to do --- getting things out of that top level lldb/test, *while* still allowing tests to pull in some of that functionality if they want. So it becomes something like:

lldb/test/lldb_platform/
    __init__.py
    ... (the package's .py files) ...
    test/
        (the tests for the package)

And then whatever is in there could be loaded from an lldb test with:

import lldb_platform.{whatever}

or

from lldb_platform import {whatever}

That seems totally reasonable to me. How does that sound?

If it weren't for the fact that we have all these switches on os.name, sys.platform, and platform.system inside the tests as well as the test running infrastructure, then hiding it would be good. But yea, the test suite itself is still really fragile due to all the conditionals, so it would be great to be able to use these cross-platform helpers from anywhere in the test suite. Making it a package sounds good, but I don't think it needs to be done as part of this CL (or even immediately after) unless you really want to.

If it weren't for the fact that we have all these switches on os.name, sys.platform, and platform.system inside the tests as well as the test running infrastructure, then hiding it would be good. But yea, the test suite itself is still really fragile due to all the conditionals, so it would be great to be able to use these cross-platform helpers from anywhere in the test suite. Making it a package sounds good, but I don't think it needs to be done as part of this CL (or even immediately after) unless you really want to.

Good - I'll hold off on that part until later.

tfiala updated this revision to Diff 35921.Sep 28 2015, 4:18 PM

Changes:

  • soft terminate is now optional by platform, not supported by default, and listed as supported on POSIX-y systems.
  • run with timeout now uses soft terminate only when supported.
  • soft terminate tests are now skipped when soft terminate isn't supported by the platform.
  • other issues called out should be fixed.

This change does not:

  • add the lldb_platform package. (I'll do that later.)
  • plumb through the option to make core file generation controllable from the command line. (I'll do that later.) Right now it maintains core file generation requests for Linux and nobody else.

@zturner, can you give this a run?

tfiala added inline comments.Sep 28 2015, 4:23 PM
test/test_runner/lib/process_control.py
553

Drats, this should be self.returncode. I'll be adjusting that. Not germane to your case, Zachary.

563

Same as above. This should be testing against self.returncode. I saw a few times where this value returned differently across calls (i.e. it started as -{signal number} on the first wait() call after the process's death, then a print statement a few lines later read it again and got 0). This would hit that same issue.

Granted, this code block shouldn't ever get hit, since we don't ever try again after a hard kill attempt.

tfiala updated this revision to Diff 35925.Sep 28 2015, 4:28 PM

Fixed up my previous inline comments. I was caching the popen object's returncode returned from wait(), since it was showing unstable values when read after a successful wait(), but I had failed to use it everywhere. Fixed that.

tfiala marked 6 inline comments as done.Sep 28 2015, 4:29 PM

Clearing out all fixed comments.

Hey Zachary,

Have you seen diff 5 or diff 6 on Windows yet? What results did you get? I think I'm just waiting on a clean Windows run at this point.

Thanks!

-Todd

Sorry got sidetracked this morning. I'll check it out now

Great, much appreciated, Zachary!

zturner added inline comments.Sep 29 2015, 12:47 PM
test/test_runner/lib/process_control.py
215

Should this return False now that this is not supported on Windows, or do you think this is still ok? It's probably just of theoretical concern.

230

Change this to:

    # It's not actually documented what return code Popen.terminate()
    # passes to TerminateProcess. Experimentation shows that it's always
    # 1, but there's no guarantee that's true. Hopefully this check
    # works well enough to be sufficient.
    return returncode != 0

With these changes I get the following output:

d:\src\llvm\tools\lldb\test\test_runner\test>python process_control_tests.py
....ss
----------------------------------------------------------------------
Ran 6 tests in 1.208s

OK (skipped=2)

Does that look right to you?

Does that look right to you?

That result looks right - the two soft terminate tests get skipped when the platform doesn't report support for soft terminate.

If you ran the lldb test suite with that setup, they should work as well as they did before the change. (If they don't, I'm missing a lower level test).

I'll fix up those other parts you called out and get what is hopefully the final patch up.

Unless you make significant other changes, feel free to just commit. I
don't have any other concerns. Thanks for working on this!

tfiala added inline comments.Sep 29 2015, 12:57 PM
test/test_runner/lib/process_control.py
215

Correct - this is not valid if soft terminate isn't supported. It should return False if soft terminate is not supported, and raise if it is supported but not overridden, I think.

230

It's not actually documented what return code Popen.terminate() passes to TerminateProcess. Experimentation shows that it's always 1, but there's no guarantee that's true.

This is checking the returncode as returned by wait() on the process. So we're checking the return value of popen_object.wait() here (which should be identical to popen_object.returncode immediately after wait() succeeds).

I think for the Windows case, we override this in the WindowsProcessHelper and have that return "True". (All kills will be hard terminate in this case). This is really only used by the test cases right now. Alternatively, we can change it to pass in the popen object, and have us write in a "did a hard kill on this" member variable on the Windows implementation. Then, we just return "True" if that is set, False otherwise.

For Windows, though, I can put in the returncode != 0 check. If the hard kill test works with that, it is fine. But I think it will also pass if a process simply returns a non-zero exit code (i.e. the return from main() is non-zero).
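A sketch of the Windows override being discussed (class and method names follow this thread; the base-class stub is illustrative):

    class ProcessHelper(object):
        """Stub of the per-platform helper in process_control.py."""
        def was_soft_terminate(self, returncode, with_core):
            raise Exception("platform needs to implement")

    class WindowsProcessHelper(ProcessHelper):
        def was_soft_terminate(self, returncode, with_core):
            # Only hard terminate exists on Windows in this scheme; a
            # nonzero return code is the best available indication, with
            # the caveat above that an ordinary nonzero exit also matches.
            return returncode != 0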

tfiala updated this revision to Diff 36043.Sep 29 2015, 3:11 PM

Fixes for Windows per previous comments, for posterity's sake.

This is the patch I will be committing.

tfiala accepted this revision.Sep 29 2015, 3:21 PM
tfiala removed a reviewer: clayborg.
tfiala added a reviewer: tfiala.

Accepting with given changes.

This revision is now accepted and ready to land.Sep 29 2015, 3:21 PM
tfiala closed this revision.Sep 29 2015, 3:22 PM

Committed here:

svn commit
Sending        test/dosep.py
Adding         test/test_runner
Adding         test/test_runner/README.txt
Adding         test/test_runner/lib
Adding         test/test_runner/lib/lldb_utils.py
Adding         test/test_runner/lib/process_control.py
Adding         test/test_runner/test
Adding         test/test_runner/test/inferior.py
Adding         test/test_runner/test/process_control_tests.py
Transmitting file data ......
Committed revision 248834.