This is an archive of the discontinued LLVM Phabricator instance.

[LIT] First pass of LLDB LIT support
ClosedPublic

Authored by beanz on Sep 14 2016, 3:07 PM.

Details

Summary

This patch supplies basic infrastructure for LLDB to use LIT, and ports a few basic test cases from the LLDB test suite into LIT.

With this patch the LLDB lit system is not capable or intended to fully replace the existing LLDB test suite, but this first patch enables people to write lit tests for LLDB.

The lit substitution for %cc and %cxx default to the host compiler unless the CMake option LLDB_TEST_CLANG is On, in which case the in-tree clang will be used.

The target check-lldb-lit will run all lit tests including the lit-based executor for the unit tests. Alternatively there is a target generated for each subdirectory under the lit directory, so check-lldb-unit and check-lldb-expr will run just the tests under their respective directories.

The ported tests are not removed from the existing suite, and should not be until such a time when the lit runner is mature and in use by bots and workflows.

Diff Detail

Repository
rL LLVM

Event Timeline

beanz updated this revision to Diff 71450.Sep 14 2016, 3:07 PM
beanz retitled this revision from to [LIT] First pass of LLDB LIT support.
beanz updated this object.
beanz added reviewers: zturner, labath, jingham, tfiala.
beanz added a subscriber: lldb-commits.
zturner edited edge metadata.Sep 14 2016, 3:09 PM

Huh... This is a surprise. Looking forward to reviewing this.

tfiala edited edge metadata.Sep 14 2016, 3:23 PM

Looks like a nice start. Thanks for pulling this together, Chris!

lit/CMakeLists.txt
14 ↗(On Diff #71450)

The macOS bots (and all the Swift ones) build with in-tree clang. I'm glad to have this option.

26 ↗(On Diff #71450)

Not entirely sure what exactly this is checking, but for non-Apple platforms (and maybe Apple in the future), we'll be using an lldb-server binary rather than debugserver. Anything we need to do here to handle that?

lit/lit.cfg
190 ↗(On Diff #71450)

As we discussed earlier, it'll be helpful to be able to parse out the compiler versions so we can do some kind of conditional check on specific version or version ranges for known bugs (or features).

tfiala added inline comments.Sep 14 2016, 3:24 PM
lit/CMakeLists.txt
14 ↗(On Diff #71450)

The macOS bots (and all the Swift ones) build with in-tree clang. I'm glad to have this option.

... test with the in-tree clang, rather.

tfiala accepted this revision.Sep 14 2016, 3:25 PM
tfiala edited edge metadata.

Assuming this doesn't break anybody, this LGTM.

This revision is now accepted and ready to land.Sep 14 2016, 3:25 PM
spyffe added a subscriber: spyffe.Sep 14 2016, 3:25 PM

I like this concept a lot, and I think it's great for testcases that actually need to interact with the command line.
I'm concerned, though, that the separation of input from command files makes tests more complex to write. One thing that's very nice about the inline tests in LLDB is that the test and the source for the debugged program are both in the same file.
If the goal is to port compiler tests across (or just to have tests be more understandable for compiler engineers), I would argue for choosing that model as the default. It would mean a slightly customized LLDB driver but that wouldn't be insanely hard. (We've already done it in Python!)

beanz updated this revision to Diff 71452.Sep 14 2016, 3:29 PM
beanz edited edge metadata.

Added lldb-server as a test dependency based on feedback from @tfiala.

One other items we discussed that is sure to come up:

  • Right now this is geared towards one compile per .test file (similar to something Zachary brought up before). One way we could get the multiple debug info formats handled is to have a .test for each of the formats, but have each of them pull in the common test checks from a shared file in the same directory. We'll have to play with that.

Couple questions:

  1. What is the status of lit and Python 3? Running the test suite on Windows requires Python 3.5+, so if we want this to work on Windows, we will need to make sure the lit infrastructure is compatible with Python 3.
lit/CMakeLists.txt
14 ↗(On Diff #71450)

LLDB's CMake already has an option called LLDB_TEST_COMPILER. Is it possible to re-use that?

We have situations where we want to run the test suite using neither the in-tree clang nor the host compiler, so I think we need to retain this flexibility to specify a path to the compiler.

33 ↗(On Diff #71450)

s/APPELND/APPEND/

lit/lit.cfg
125–126 ↗(On Diff #71450)

Can you use os.path.join instead of hardcoding a forward slash? This might not apply though given my earlier comment about LLDB_TEST_COMPILER.

144 ↗(On Diff #71450)

os.path.join here, and on Windows you will need to add .exe

189–190 ↗(On Diff #71450)

I'm ok with removing this branch. Currently on Windows we require clang.exe as the test compiler, and we use it in cl driver mode, so command lines are mostly interchangeable across platforms. Having to support an entirely different command line syntax would fragment the tests too much.

jingham edited edge metadata.Sep 14 2016, 3:34 PM

Writing tests this way means we're going back to testing the command line commands. That was what gdb did for the most part, and you ended up terrified of making changes that might effect command output, not because the change was hard but because knew you would have to go waste a day or your life making annoying fixes to a whole bunch of tests. This happened in lldb with the command line tests we did have, specifically with the breakpoint reporting. So I had to go make functions for setting breakpoints and making sure they got set correctly - and change all the tests to use them - so I could change just one place when I added output to breakpoint reporting.

We made a choice a while ago to favor tests using the SB API's for this reason. Is the virtue of lit sufficient that we really want to go back on this decision?

One piece of concern that I have is the multiplication of things that "testing LLDB" means.

We have unit tests, Python tests, and now LIT tests.

That means, for each change I want to commit, if I want confidence that I am not breaking anything (which seems like a thing one could conceivably want :-) now I need to know and perform three unrelated incantations instead of one.

If that is the temporary state of affairs while we transition to a better new world, then so be it. But I would definitely want for there to eventually be one true incantation that runs all tests, no matter their flavor.

Writing tests this way means we're going back to testing the command line commands. That was what gdb did for the most part, and you ended up terrified of making changes that might effect command output, not because the change was hard but because knew you would have to go waste a day or your life making annoying fixes to a whole bunch of tests. This happened in lldb with the command line tests we did have, specifically with the breakpoint reporting. So I had to go make functions for setting breakpoints and making sure they got set correctly - and change all the tests to use them - so I could change just one place when I added output to breakpoint reporting.

We made a choice a while ago to favor tests using the SB API's for this reason. Is the virtue of lit sufficient that we really want to go back on this decision?

I had been thinking about this the other day as well. An example that springs to mind that I personally encountered is when a test on Windows was trying to verify that the value of argc was 3. So it was trying to match "argc=3" in stack trace. Seems reasonable, except that the code was actually completely broken on Windows, and argc was pointing to junk memory, which was something like "argc=3239082340982", and the test was passing because this value happened to start with a 3.

One idea might be to add a developer command to LLDB that is able to drill down much deeper to return specific values of interest instead of large blocks of text. For example, imagine commands like this:

(lldb) print-dev stack-frame[0].params[argc]
3
(lldb) print-dev threads.count
7
(lldb) print-dev --hex threads[6].id
0x1234

Now imagine you can get every little nook and cranny of the process's state through a syntax such as this.

As you know I've been a big supporter of testing the API since the beginning for the same reason you mention -- that it makes people too afraid to change the output format -- but to answer your question: I do think the virtue of lit is high enough that we find a way to solve the problem. The above idea is one possibility for reducing the fragility of command line tests, but if we brainstorm we can probably come up with others as well.

It's certainly a lot of work, but I do think the benefits are worth it.

Also, it occurred to me that if all tests were like this (and yes, that's a tall order to imagine a world where not a single test was written using the Python API), we could probably actually drop the Python 3.5 requirement on Windows.

Another thing that's nice about tests like this is that it makes it trivial to see how to reproduce a failure. It's currently very hard to debug failures because you have to first figure out where in the test it's failing (i.e. what line of python), then attach to the python executable and try to get a breakpoint on the native code side in the right SB API call matching up with the place where you determined the test is failing. This is really a pain without a debugger that supports mixed<->native transitions between python and c++, and even with a debugger that does support it like we have on Windows, it often doesn't work very well or exhibits flakiness.

beanz added a comment.Sep 14 2016, 3:56 PM

@zturner, on many of your comments I need to do some research and get back to you. Particularly I need to understand how lit works on Windows better. I do have one inline response.

@granata.enrico, we could migrate the existing tests into being executed by lit even if they aren't using lit's features, so if that direction is desired we could get everything in lit. That said, you shouldn't ever really have multiple incantations. Once we have reliable lit testing that is useful it should be connected to the "check-lldb" and "check-all" targets appropriately. Just because it runs more than one type of test doesn't mean you need multiple incantations, and more and varied testing is generally better for the quality of the product.

@jingham and @zturner, we can also take advantage of FileCheck's use of regular expressions to write robust matchers. In general LLVM has managed to change text output formats many times in radical ways, and LIT's testing has still suited the project well.

And to echo @zturner's last comment, one huge benefit to LIT is that reproducing failures is very simple. LIT failures log simple shell commands that can be executed to reproduce.

lit/CMakeLists.txt
14 ↗(On Diff #71452)

The LLDB_TEST_COMPILER option doesn't signify that it is using an in-tree or out-of-tree compiler which is significant if we're going to tie the test target to depending on the clang target. We could support that option in addition to this one, but I see them as distinctly different.

@granata.enrico, we could migrate the existing tests into being executed by lit even if they aren't using lit's features, so if that direction is desired we could get everything in lit. That said, you shouldn't ever really have multiple incantations. Once we have reliable lit testing that is useful it should be connected to the "check-lldb" and "check-all" targets appropriately. Just because it runs more than one type of test doesn't mean you need multiple incantations, and more and varied testing is generally better for the quality of the product.

The problem is that some of us at Apple build LLDB in Xcode, and then test by saying

$ ./dotest.py

which means no amount of CMake magic will do anything for us. If Xcode builds are supposed to be deprecated and not-to-be-used, that's a possible path forward (but one I am hearing about for the first time...)

Mind you, I would not be opposed to having dotest.py also run lit tests and unit tests - or lit run Python and unit tests as well if that's the preferred direction

I am just worried about the multiplication of solutions we see in LLDB (Xcode build vs. Cmake build, unit tests, vs lit tests vs Python tests, ...) - it would be nice to streamline those at some point. Your patch is just getting friendly fire because it adds one more axis to this space is all :)

beanz updated this revision to Diff 71460.Sep 14 2016, 4:08 PM
beanz edited edge metadata.
  • Use lit.util.which instead of constructing paths, which adds exe suffixes and uses proper path seperators
  • Fixed spelling error in list(APPEND ...) command
beanz added a comment.Sep 14 2016, 4:13 PM

@granata.enrico, it seems to me that this is a problem the people maintaining the Xcode project should address. I know it is possible to create targets in Xcode projects that call shell commands or makefiles (CMake does this). It is therefore possible that the Xcode project could have a "check" target that runs all the tests, and I really don't think we should limit our testing options based on solvable problems.

Todd probably knows about this since he added the gtest target to the Xcode project. I imagine there's a python test suite target and a gtest target. Isn't it possible to have a third target which when you build it, builds the other two targets (which would run the test suites)? If that's not possible, then having a target which invokes a command line would work as beanz mentioned.

zturner added inline comments.Sep 14 2016, 4:18 PM
lit/CMakeLists.txt
14 ↗(On Diff #71452)

At the same time, setting LLDB_TEST_COMPILER to something and LLDB_TEST_CLANG seem to be incompatible with each other. Should we error if both of them are set?

Another possibility is to allow one to set LLDB_TEST_COMPILER to be set to a magic value like <in-tree>

beanz added inline comments.Sep 14 2016, 4:23 PM
lit/CMakeLists.txt
14 ↗(On Diff #71460)

Disallowing setting both seems reasonable to me. I'm not entirely sure how to connect LLDB_TEST_COMPILER up into the lit suite because we really want something that more matches the CMake style of ..._<LANG>_COMPILER so that we could override multiple compilers.

One more question: Is there a way in lit that we can append command line flags to the run lines even if the user doesn't specify them? For example in the substitution?

For example, if someone writes # RUN: %cxx %p/Inputs/call-function.cpp -g -o %t && %lldb -b -s %s -- %t | FileCheck %s then this is always going to run clang function.cpp -g -o. But we need to manipulate the command line on different platforms. Like on Windows we will need to add -fms-compatibility -fuse-ld=lld, and various other things. And on other platforms there are other specific things that always have to be added. Is this possible somehow?

(You don't have to address it in this patch, just curious)

lit/CMakeLists.txt
14 ↗(On Diff #71460)

What if you added a parallel of LLDB_TEST_COMPILER directly in this file, that is used specifically for lit tests? Like LLDB_LIT_TEST_COMPILER or LLDB_LIT_CLANG_PATH?

The only reason I'm harping on this is because as it stands, this won't work on windows. (The host compiler is MSVC, which uses a completely different command line syntax, and the in-tree clang is going to be a debug one when doing a debug build, which is going to be unacceptably slow).

beanz added a comment.Sep 14 2016, 4:38 PM

@zturner, we can absolutely add flags into the substitutions. Other projects do that too.

lit/CMakeLists.txt
14 ↗(On Diff #71460)

We could add parallels. How about instead of trying to match the existing model though we go with something more like LLDB_TEST_C_COMPILER and LLDB_TEST_CXX_COMPILER as values that replace the %cc and %cxx substitutions respectively?

zturner added inline comments.Sep 14 2016, 4:39 PM
lit/CMakeLists.txt
14 ↗(On Diff #71460)

Works for me.

zturner accepted this revision.Sep 14 2016, 7:56 PM
zturner edited edge metadata.

lgtm from my point of view after LLDB_TEST_CXX_COMPILER and LLDB_TEST_C_COMPILER are added.

One question: Currently the tests appear to run all commands until the end of the file, and then do 1 pass over the output through FileCheck. Is this correct?

We have a lot of tests where the command to run next depends on the output of a previous command. If the test is running LLDB commands rather than using the python api, this means you need some pexpect-style solution that can read output and send input to the process as it's running. Does lit have anything like this? I'm pretty sure the answer is no, but just want to check. We can punt on supporting this type of test until later, just want to find out if it's already supported.

labath edited edge metadata.Sep 15 2016, 4:01 AM

The biggest feature I see missing here is the ability to run tests remotely. Remote debugging is the most important use case for our team, and now we have a number of (experimental) bots running the remote test suite. We want to make sure we can debug correctly when the host and target have different architectures, operating systems, pointer sizes, etc.

Some of the implications of this are:

  • we need to be able to fully specify the toolchain used to compile the target executables (specifically, we should not assume the host compiler)
  • this includes the need to pass funny compiler switches based on the target (some android targets can only run -pie executables, some cannot run -pie executables at all).
  • the test suite needs to know how to connect to the remote target (mostly it's just executing three commands: platform select XXX, platform connect and platform settings --working-dir YYY), but it can get a bit more complicated in some cases (if we need to install shared libraries along with the main executable, if it involves the test executable writing something to a file, etc.)
  • we need to be careful about strong cmake integration. The same lldb binary is capable of debugging all kinds of targets. We should not require a fresh cmake build to run the test suite against multiple targets. If we need to switch cmake options (LLDB_TEST_COMPILER, LLDB_TEST_ARCH, LLDB_TEST_URL or similar) and it does not trigger a rebuild then it's usable, but slightly inpractical. For us, it would be best to be able a fire off a remote test with a single command (like we can do now).

I am not against this going in without the remote feature in the first version, but I think we should think hardly about it, as I think it could impact a number of design decisions.

lit/Expr/TestCallStdStringFunction.test
4 ↗(On Diff #71460)

I don't think anyone has tested lldb with icc in the past few years. We can avoid adding those xfails to the new tests.

Btw, I will be in the bay area from 3rd to 7th of October. Maybe we could sit down and talk the design and requirements through in person? I've been hoping to speak to some of you in person anyway, and this could be a good first topic on the agenda...

penryu added a subscriber: penryu.Sep 15 2016, 11:05 AM
beanz updated this revision to Diff 71537.Sep 15 2016, 12:50 PM
beanz edited edge metadata.
  • Removed the XFAIL for icc
  • Added support for LLDB_TEST_<LANG>_COMPILER options

Assuming there are no objections, since the patch has been approved by two reviewers I'll commit it this afternoon.

Thanks,
-Chris

This revision was automatically updated to reflect the committed changes.