
Add support for batch-testing to the LLDB testsuite.
Needs ReviewPublic

Authored by bbli on Jun 11 2020, 4:19 PM.

Details

Summary

[LLDB] Hi, this is a patch for the proposed GSoC project "Add support for batch-testing to the LLDB testsuite". The goal is to let a test continue past a failed assertion when multiple assertions run in sequence, similar to how GoogleTest's EXPECT_* macros work. Test results from running "ninja check-lldb" were the same before and after the patch, and I have included a test case for the new functionality.


Event Timeline

bbli created this revision. Jun 11 2020, 4:19 PM
labath added a subscriber: labath. Jun 12 2020, 4:28 AM

It looks like the official Python unittest framework has grown support for a subtest feature, which seems to be very similar to what we want here. It may make more sense to update to a newer framework rather than patching the copy that we have. Even if we can't update for some reason, it would be good to do something equivalent, so that it is easier to update in the future.

The part I like about the upstream solution is that it provides scoping for the failures -- a single failure terminates a "subtest", but still lets the whole test continue. This is important when you have more complex "assertion" functions consisting of multiple smaller assertions. A typical example would be something like:

def check_foo(self, foo):
    self.assertGreater(len(foo), 2)
    self.assertEqual(foo[0], 47)
    self.assertEqual(foo[1], 42)
    ...

def test(self):
    with self.batchTest():
        self.check_foo(get_one_foo())
        self.check_foo(get_another_foo())
        ...

This patch essentially makes all assertions non-fatal, which means that if the first assertion in check_foo fails, the function will still continue and spew a bunch of uninteresting errors. With subtests, that won't happen -- the first assertion will abort the rest of the check_foo function, but it will let the other call continue (if it's in a different subtest):

with self.subTest(): self.check_foo(get_one_foo())
with self.subTest(): self.check_foo(get_another_foo())

This can result in a bit more typing, but that can be resolved by adding some extra sugar on top. For example, since the typical use case for this will be to run a bunch of independent expression (or frame var) commands, we could make a special version of the expect_expr command which runs the expression in a subtest. (Or that might even be the default.)
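One possible shape for that sugar, sketched with a made-up helper name (`checked` is not an existing lldbtest API, and plain assertions stand in for real expression evaluation):

```python
import io
import unittest

class SugarDemo(unittest.TestCase):
    def checked(self, label, fn):
        # Hypothetical wrapper: run one check inside its own subtest, so
        # a failure aborts only that check, not the whole test method.
        with self.subTest(expr=label):
            fn()

    def test_exprs(self):
        self.checked("first", lambda: self.assertEqual(1 + 1, 2))
        self.checked("second", lambda: self.assertEqual(2 * 3, 6))

suite = unittest.defaultTestLoader.loadTestsFromTestCase(SugarDemo)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

A real version would presumably bury the `with self.subTest():` inside expect_expr itself, so callers get the batching behavior for free.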

bbli added a comment. Jun 12 2020, 6:33 PM
  • Ohh ok, I guess I should have clarified the point about catching all exceptions recursively before I began.
  • From looking at the code for subTest, the error-catching functionality comes from the Outcome object's testPartExecutor context manager, and the rest seems to be subTest-specific. However, the repo's version of unittest2 does not define the Outcome class. Should I port over just this class, then? As for updating to the newer framework, I am personally not confident doing so, since I do not know the exact specifics of why the fork was made, and from what I have been told, the version of unittest2 in the repo is about 10 years old.
  • Also, testPartExecutor will catch all exceptions, but I believe we want to exit early and propagate errors other than self.failureException all the way up for runMethod to handle, correct?
  • Finally, by expect_expr, are you referring to the "expect*" methods defined in lldbtest.py? And are you suggesting that the error-handling context manager be called inside these functions? Wouldn't that lead to the same problem of a function continuing when it contains multiple expect_expr commands?
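For context, a simplified sketch of the mechanism under discussion, loosely modeled on CPython's private unittest.case._Outcome rather than the repo's unittest2 code:

```python
import contextlib
import sys

class Outcome:
    """Simplified model of unittest's _Outcome: a failure in one "part"
    is recorded instead of unwinding the whole test method."""

    def __init__(self):
        self.success = True
        self.errors = []

    @contextlib.contextmanager
    def testPartExecutor(self, test_case):
        try:
            yield
        except KeyboardInterrupt:
            raise  # Ctrl-C should always abort the run
        except Exception:
            self.success = False
            self.errors.append((test_case, sys.exc_info()))

outcome = Outcome()
with outcome.testPartExecutor("part1"):
    raise AssertionError("first failure")  # recorded, not propagated
with outcome.testPartExecutor("part2"):
    pass  # still runs despite the earlier failure
```

The real implementation also distinguishes SkipTest and knows whether it is running inside a subtest; this sketch only shows the record-and-continue core.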
bbli added a comment. Jul 1 2020, 6:10 PM

Hi, bumping my post from two weeks ago. The main question I had was: would it be ok if I just brought over the Outcome object for the time being?

labath added a comment. Jul 2 2020, 8:34 AM

Hi, bumping my post from two weeks ago. The main question I had was: would it be ok if I just brought over the Outcome object for the time being?

Umm... I don't know... Maybe? I suppose it depends on how invasive the change would be... If you're willing to give it a try, I am willing to look at the result, but I gotta warn you that I am pretty averse (and I know some of the other contributors are too) to modifying imported third-party code.

OTOH, if you're just looking to do something, I can think of other helpful fixes that would be a lot less controversial. For example, the thing that's annoying me now is that whenever a test fails in the "make" phase (e.g. a compile error), we only get an error message with the return status, and the original make invocation. It would be very helpful if that error could include the entire make output, including all the compile commands that it ran. In my book that would be more helpful (and a lot easier) than the batch mode.

I'd also be happy to help you with attempting to migrate to a newer unittest framework, if you're willing to give that a shot. Even if we don't end up doing it, I think just tracking down the original version this was forked from, and enumerating our changes to it (and their reasons) would be helpful.

  • Also, testPartExecutor will catch all exceptions, but I believe we want to exit early and propagate errors other than self.failureException all the way up for runMethod to handle, correct?

I don't have a strong opinion on that and I'm happy to go with whatever the upstream version does. In theory a different exception should not mean a failure, but a problem in the test itself; in practice it usually does not work out that way -- a lot of failures manifest themselves as AttributeError: 'NoneType' object has no attribute 'foo' errors (because an object ended up being null when it shouldn't be) or similar. And I'm not sure that littering the test with self.assertIsNotNone(foo) is worth it...

  • Finally, by expect_expr, are you referring to the "expect*" methods defined in lldbtest.py? And are you suggesting that the error-handling context manager be called inside these functions? Wouldn't that lead to the same problem of a function continuing when it contains multiple expect_expr commands?

I was specifically referring to the expect_expr method, not the entire expect function family. You're right that this would mean that the caller of expect_expr would continue in its work, but what I wanted to do is make this an official part of the expect_expr contract. expect_expr is a fairly new function, and the way we currently use that is just to inspect values of various objects. There's no harm in batching those. We could just add some wording to its documentation to say that one should not use that function to mutate state as that could lead to very confusing error messages if that command fails.

bbli added a comment. Jul 2 2020, 3:32 PM

For example, the thing that's annoying me now is that whenever a test fails in the "make" phase (e.g. a compile error), we only get an error message with the return status, and the original make invocation.

Sure thing. How should I get started (in particular, where in the codebase should I look for the compile aspect of this fix)?

I'd also be happy to help you with attempting to migrate to a newer unittest framework, if you're willing to give that a shot.

Hmm, how about I give it a shot after working on the above, so that I first get more background on the inner workings of unittest2.

labath added a comment. Jul 3 2020, 8:02 AM

For example, the thing that's annoying me now is that whenever a test fails in the "make" phase (e.g. a compile error), we only get an error message with the return status, and the original make invocation.

Sure thing. How should I get started (in particular, where in the codebase should I look for the compile aspect of this fix)?

I believe the relevant code should be somewhere in packages/Python/lldbsuite/test/plugins/. The way I'd approach this is to deliberately introduce a compile error into a test and then work my way back to where that is printed. E.g., if I add CFLAGS_EXTRAS := -std=c99 -foo to test/API/sample_test/Makefile, I get the following error:

Error when building test subject.

Build Command:
make VPATH=.../lldb/test/API/sample_test -C .../lldb-test-build.noindex/sample_test/TestSampleInlineTest.test_dwarf .... 

Build Command Output:
clang-10: error: unknown argument: '-foo'
clang-10: error: unknown argument: '-foo'
make: *** [<builtin>: main.o] Error 1

Ideally this error should include the actual command line that caused clang to print this error message, as normally the problem will not be as obvious.

I'd also be happy to help you with attempting to migrate to a newer unittest framework, if you're willing to give that a shot.

Hmm, how about I give it a shot after working on the above, so that I first get more background on the inner workings of unittest2.

Well... I think this will be a relatively simple fix, so I'm afraid you won't gain any deep unittest knowledge from doing it. But let's see how it goes... We have lots of areas that need improvement, so we can come up with additional warm-up projects :)

bbli added a comment. Jul 3 2020, 1:02 PM

Ideally this error should include the actual command line

So to clarify, you want to also print out the exact clang command/what flags were passed to clang whenever compile errors happen?

We have lots of areas that need improvement, so we can come up with additional warm-up projects :)

Ohh ok. Sounds good to me!

On another note, would it be more appropriate to be communicating through email rather than on this pull request?

labath added a comment. Jul 7 2020, 9:19 AM

Ideally this error should include the actual command line

So to clarify, you want to also print out the exact clang command/what flags were passed to clang whenever compile errors happen?

Yes, basically the entire output of the make command, as if I had run it in a shell (that's what I normally do when I can't decipher a test suite error). I think it might be possible to get make to print _only_ the command that failed, but (a) I don't think that's worth it, and (b) the output of the other commands can be useful at times.

On another note, would it be more appropriate to be communicating through email rather than on this pull request?

I don't care either way. Feel free to drop me an email if you want.

bbli added a comment. Jul 7 2020, 6:23 PM

Hi, so I think I got the fix working. Attached is a screenshot of the new output, with the title "Build Command Stdout Output". Should I submit a new pull request for this?

Also, just wondering: it seems you have added an lldb_extensions dictionary to the CalledProcessError class. Was this monkey-patched in? I don't see subprocess among the vendored third-party libraries in the repo.

labath added a comment. Jul 8 2020, 4:49 AM

Hi, so I think I got the fix working. Attached is a screenshot of the new output, with the title "Build Command Stdout Output". Should I submit a new pull request for this?

Yes, that looks better. Please create a patch for that. It might be even better if the stdout+stderr contents came as a single stream (as if run by cmd 2>&1). That way the error messages on stderr will appear right next to the compiler invocation (which goes to stdout). I believe this could be achieved by setting stderr to STDOUT (instead of PIPE).
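The stderr-to-STDOUT suggestion corresponds to something like the following generic subprocess sketch (not the actual lldbsuite build-driver code; the child process here merely stands in for make):

```python
import subprocess
import sys

# Merging stderr into stdout yields one interleaved stream, as if the
# command had been run with `cmd 2>&1`; error messages then appear right
# next to the echoed command that produced them.
child = (
    "import sys; print('building main.o'); "
    "print('error: unknown argument', file=sys.stderr)"
)
proc = subprocess.run(
    [sys.executable, "-u", "-c", child],  # -u keeps the ordering stable
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,  # instead of stderr=subprocess.PIPE
    text=True,
)
print(proc.stdout)
```

With `stderr=subprocess.STDOUT`, `proc.stderr` is None and both streams arrive in `proc.stdout` in the order the child emitted them.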

Also, just wondering: it seems you have added an lldb_extensions dictionary to the CalledProcessError class. Was this monkey-patched in? I don't see subprocess among the vendored third-party libraries in the repo.

It looks like this just exploits the lack of access protections in Python to add a new field to an arbitrary object (lldbtest.py:442). A cleaner approach would be to create a new class for this exception (perhaps by subclassing CalledProcessError).
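The subclassing suggestion could look roughly like this; `build_output` is a made-up field name for illustration, not existing lldbtest code:

```python
import subprocess

class BuildError(subprocess.CalledProcessError):
    """Sketch of a dedicated exception type, instead of monkey-patching
    an extra dictionary onto a plain CalledProcessError instance."""

    def __init__(self, returncode, cmd, build_output):
        super().__init__(returncode, cmd)
        self.build_output = build_output  # hypothetical field

    def __str__(self):
        # Append the captured build output to the standard message.
        return "%s\nBuild output:\n%s" % (super().__str__(),
                                          self.build_output)

err = BuildError(2, "make -C sample_test",
                 "clang: error: unknown argument: '-foo'")
print(err)
```

Because it still subclasses CalledProcessError, existing `except subprocess.CalledProcessError` handlers keep working, while the extra data travels on a named attribute instead of an ad-hoc dict.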

bbli added a comment. Jul 8 2020, 6:21 PM

Hi Pavel, so I submitted the new patch (and added you as a reviewer), but it looks like the remote build failed. I ran ninja check-lldb locally and it works fine. How would I go about debugging this? I tried clicking on the links to the failed build, but couldn't really interpret the output.