This is an archive of the discontinued LLVM Phabricator instance.

Merge dsym and dwarf test cases
ClosedPublic

Authored by tberghammer on Sep 21 2015, 11:48 AM.

Details

Summary

Merge dsym and dwarf test cases

Currently most of the test files have a separate dwarf and a separate dsym test with almost identical content (only the build step is different). With adding dwo symbol file handling to the test suit it would increase this to a 3-way duplication. The purpose of this change is to eliminate this redundancy with generating 2 test case (one dwarf and one dsym) for each test function specified (dwo handling will be added at a later commit).

Main design goals:

  • There should be no boilerplate code in each test file to support the multiple debug info in most of the tests (custom scenarios are acceptable in special cases) so adding a new test case is easier and we can't miss one of the debug info type
  • In case of a test failure, the debug symbols used during the test run have to be cleanly visible from the output of dotest.py to make debugging easier both from build bot logs and from local test runs
  • Each test case should have a unique, fully qualified name so we can run exactly 1 test with "-f <test-case>.<test-function>" syntax
  • Test output should be grouped based on test files the same way as it happens now (displaying dwarf/dsym results separately isn't preferable)

Proposed solution (main logic in lldbtest.py, rest of them are test cases fixed up for the new style):

  • Have only 1 test fuction in the test files what will run for all debug info separately and this test function should call just "self.build(...)" to build an inferior with the right debug info
  • When a class is created by python (the class object, not the class instance), we will generate a new test method for each debug info format in the test class with the name "<test-function>_<debug-info>" and remove the original test method. This way unittest2 see multiple test methods (1 for each debug info, pretty much as of now) and will handle the test selection and the failure reporting correctly (the debug info will be visible from the end of the test name)
  • Add new annotation @no_debug_info_test to disable the generation of multiple tests for each debug info format when the test don't have an inferior

Diff Detail

Event Timeline

tberghammer retitled this revision from to [RFC] Merge dsym and dwarf test cases.
tberghammer updated this object.
tberghammer added a subscriber: lldb-commits.
emaste edited edge metadata.Sep 21 2015, 1:10 PM

Haven't reviewed in depth yet but in general I really like this idea.

test/lldbtest.py
524

...as a test that doesn't use...

clayborg edited edge metadata.Sep 21 2015, 1:20 PM

I like the idea. One question I have is if tests should opt in to multi-debug, or if tests should opt out. This solution does the opt out method. Would it be better to opt in? With something like:

@debug_info_test
def test_stuff(self):
    # ...

I am fine with either, I am just checking in case people have a preference and which one makes more sense. I like opting out because most of our tests will probably need the multiple debug info formats. Also, are we verifying that the compiler supports each different kind of mode? Or do we just try it and we would fail of the compiler say didn't handle DWO debugging?

Also if the compiler doesn't support the different modes do we avoid mutating the classes? Or if we disable the extra DWARF in .o file mode (be it DWO or Apple) do we avoid mutating the classes?

I decided to make it an opt-out feature as most of the test case will use some debug info and this way you don't have to specify it explicitly (mutating a test what don't have any debug info only have performance hit because it will run the same test multiple times).

About mutating, my current plan is to mutate every test isn't marked as no_debug_info_test and mark the test with dwarf_test / dsym_test / dwo_test annotations. The purpose of these annotation would be to enable the debug info based test filtering (e.g. the already existing "-N dwarf" option) and it should also test if the current system (compiler, OS) supports the given debug info format and skip the test if necessary (the dsym_test annotation already skip the test on non-darwin systems).

Other option is to generate the mutated tests only on platforms where they are supported. It have the advantage of removing a huge number of skipped tests and reducing the noise with it, but it also makes it more difficult to compare platforms (and possibly mask away some basic coding errors). I don't have a strong opinion about the two options so if somebody have preferences then let me know.

zturner edited edge metadata.Sep 21 2015, 3:58 PM
zturner added a subscriber: zturner.

This is great, I've been thinking about implementing the exact same thing
myself.

Is it possible to remove the need to call self.build()? It seems like the
the wrapped_run function could do it before invoking the actual test
method. This way a large majority of tests could be made simpler.

If you want to customize the build process, you could use a decorator like:

@build_properties(build=False)

or

@build_properties(build=True, clean=False)

or

@build_properties(dictionary={foo})

If no @build_properties decorator is present, it just calls self.build()
before invoking the test.

Thoughts?

tberghammer retitled this revision from [RFC] Merge dsym and dwarf test cases to Merge dsym and dwarf test cases.
tberghammer updated this object.
tberghammer edited edge metadata.

Update all tests to use the new dsym/dwarf generation method where applicable and do minor cleanups on the tests:

  • Inline methods where the number of call sites decreased to 1
  • Remove trivial setUp() methods
  • Fix incorrect skip/xfail markings

At the current stage there are no regression on Linux (x86_64, i386) /Android (arm).

On OSX there are 3 new test what are failing after this change because beforehand they were running only with dsym and currently the dwarf version of them is failing (we haven't tested them before). The list of test cases where this issue happens:

  • TestHiddenIvars
  • TestObjCIvarStripped
  • TestObjCStaticMethodStripped

Greg: What should I do with these cases? (Should I disable them as they shouldn't run with dwarf, should I XFAIL them or should I just let them being failed?)

Ed, Zachary: I haven't tested this change on FreeBSD/Windows. I expect no regression there but it would be good if you can try it out and let me know how it goes.

Further plans with the test suit (based on available resources):

  • Add logic to run tests with dwo debug info
  • Possibly implement the suggestion of Zachary about moving the build stage into the wrapper function (I am nor sure it will help the readability too much)
  • Possibly remove the setUp functions from the test cases where their content can be moved to other functions without code duplication (in several case it just initialize a line_number variable)

I am at CppCon all week. If you need to get this in before Monday, can you
have Oleksiy or Chaoren test on Windows? Otherwise I can take a look next
week.

clayborg accepted this revision.Sep 22 2015, 10:34 AM
clayborg edited edge metadata.

Looks good. For the failing test cases, just check this stuff in and we should take care or marking any needed tests and expected fail. Is there a way to mark a test such that the "dwarf in .o files" will expected fail if we need to do that? If there currently isn't, we need to add the ability to say that normal dwarf (dSYM file for MacOSX or dwarf in object file for all other platforms), DWO, and dwarf in .o files (apple) can be marked as expected fail, skip, etc...

This revision is now accepted and ready to land.Sep 22 2015, 10:34 AM

Looks good. For the failing test cases, just check this stuff in and we should take care or marking any needed tests and expected fail. Is there a way to mark a test such that the "dwarf in .o files" will expected fail if we need to do that? If there currently isn't, we need to add the ability to say that normal dwarf (dSYM file for MacOSX or dwarf in object file for all other platforms), DWO, and dwarf in .o files (apple) can be marked as expected fail, skip, etc...

I added an [expectedFailure,expectedFlakey,skipIf][Dwarf,Dsym] (will add Dwo whit the rest of the dwo specific logic) decorators and we can also specify the debug info for expectedFailureAll. If we need more decorators then we can add them the first time we have to use them (the debug info fomrat is stored in self.debug_info as a string)

At some point I really would love to reduce the number of decorators. It's
starting to get ridiculous :)

Seems like we only need one decorator that takes everything as optional
arguments

tfiala accepted this revision.Sep 22 2015, 4:07 PM
tfiala edited edge metadata.

LGTM as well. Thanks!

tberghammer updated this object.
tberghammer edited edge metadata.

Re-base to ToT.

I plan to commit this CL in on Wednesday morning if there is no objection from Windows/FreeBSD side.

No objection from me on the FreeBSD side. I'm leaving shortly to travel to EuroBSDCon so probably will not be able to test before Wednesday, but I can investigate and update decorators as appropriate afterwards.

This revision was automatically updated to reflect the committed changes.
emaste added a comment.Oct 1 2015, 1:05 AM

I'm leaving shortly to travel to EuroBSDCon so probably will not be able to test before Wednesday, but I can investigate and update decorators as appropriate afterwards.

A follow up - the tests look good on FreeBSD after this change. On my branch with a few updates to FreeBSD expected failure decorators,

[1/1] Testing LLDB (parallel execution, with a separate subprocess per test)
Testing: 399 test suites, 8 threads
399 out of 399 test suites processed - TestMiFile.py                           
Ran 399 test suites (0 failed) (0.000000%)
Ran 322 test cases (0 failed) (0.000000%)

Unexpected Successes (1)
UNEXPECTED SUCCESS: LLDB (suite) :: TestRecursiveInferior.py (FreeBSD feynman 10.2-STABLE FreeBSD 10.2-STABLE #44 r288174+7644546(stable-10): Thu Sep 24 14:02:11 EDT 2015     emaste@feynman:/tank/emaste/obj/tank/emaste/src/git-stable-10/sys/GENERIC amd64 amd64)
      224.52 real       665.99 user       366.72 sys

Hi Tamas,

This breaks TestBuiltinTrap on Windows. Do you have any insight about why
this might be happening? That test is not modified at all by this patch
from what I can tell, but a bisect pinpoints this CL as the reason for the
failure.

Actually nevermind, the changed file list was cut off by the email, so I didn't see that it had been modified. It looks like you just didn't copy over the @skipIfWindows decorator. I can fix that easily, sorry for the noise.