This is an archive of the discontinued LLVM Phabricator instance.

Add a generic build script for building test inferiors
ClosedPublic

Authored by zturner on Nov 26 2018, 1:39 PM.

Details

Summary

This is an alternative approach to D54731. Instead of allow the user to invoke arbitrary python code directly from inside of a test, this patch adds a python script called build.py which we can use the existing substitution system for. For example, we could write something like:

RUN: %build --arch=32 --source=%p/Inputs/foo.cpp --output=%t.exe --mode=compile-and-link

Currently, the build script requires you to pass an explicit compiler, so you would need to pass something like --compiler=path/to/clang++, and it will fail without this. So this isn't quite ready for immediate use. We need to figure out how to get that compiler from CMake into this script. Maybe it means bringing back LLDB_TEST_COMPILER (sigh). Maybe it means auto-detecting the most appropriate compiler by looking in the tools dir, perhaps with something like an additional command line --find-toolchain=msvc or --find-toolchain=clang.

We have some tests right now that require MSVC, so while we can always just keep using a cl.exe command line, it would be nice to be able to port these over and say something like

RUN: %build --find-toolchain=msvc --arch=32 --source=%p/Inputs/foo.cpp --output=%t.exe --find-toolchain=msvc --mode=compile-and-link

For now though, I've tested this on the command line with an explicit --compiler option, and it works with clang-cl + lld-link as well as cl.exe + link.exe, using both Python 2 and Python 3.

I have not implemented the GCC/clang builder yet, although I plan to do that in followups.

Note that at least for Windows, this new builder is significantly better than our existing method of using run lines, because it has advanced toolchain detection logic. Our existing scripts fail when being run from inside of Visual Studio and they also don't allow building an explicit architecture for technical reasons, but with this method we can support both.

Diff Detail

Repository
rLLDB LLDB

Event Timeline

zturner created this revision.Nov 26 2018, 1:39 PM

I personally prefer this approach, if only for the fact that the substitutions are more familiar and that this change would be local to lldb.

zturner updated this revision to Diff 175344.Nov 26 2018, 2:50 PM

It turned out to not be super complicated, so I went ahead and made it support toolchain auto-detection. This allows the substitution to bake in a command line parameter of --compiler=any, so the user doesn't have to specify anything when writing the test. But since subsequent arguments take precedence, they can pass --compiler=clang or something later and it will override the default value. I also made the arch default to 64 bit. Finally, I updated a test to actually use this to prove that it works.

zturner updated this revision to Diff 175366.Nov 26 2018, 4:58 PM

I went ahead and fixed up the rest of the native pdb tests. This exposed a couple of minor issues with the build script which I also fixed in the process.

@stella.stamenova would you mind giving this a try? I think it will fix all your problems of running tests from inside of Visual Studio.

@aleksandr.urakov Would you also mind giving this a try? I think it will mean that you can re-write the SymbolFile/PDB tests to not be dependent on running from a VS command prompt. You should be able to pass --arch=32 or --arch=64 no matter which command prompt you are running from, and it should still work.

@aleksandr.urakov Would you also mind giving this a try? I think it will mean that you can re-write the SymbolFile/PDB tests to not be dependent on running from a VS command prompt. You should be able to pass --arch=32 or --arch=64 no matter which command prompt you are running from, and it should still work.

Sure, thank you for the new build system! I've created a review here: D54942. There are some problems left, I've explained them in the review description.

I didn't look at the code in detail, as most of it deals with windows stuff, and I don't know much about those anyway. However, the interesting question for me would be how to make this useful for cross-compiling. Right now that sort of works for NativePDB tests because --compiler=clang-cl implies windows, but that won't help if I want to compile say a linux arm64 binary. I think that instead --arch, we should have a --triple argument, which specifies the exact target you want to build for. That can default to "host", and we can have special pseudo-triples like "host32" and "host64", if we want to be able to say "I want to build for a 32-bit flavour of the host arch". That could also make gcc detection easier, since you could just search for $triple-gcc.

Another route to take might be to say that this script only supports building for the host, and tests that want to target a specific architecture can just invoke the appropriate compiler directly -- since they know the target arch, they also know the correct compiler option syntax. Plus, these kinds of tests are the ones that are most likely to need fancy compiler command line switches that might be hard to represent generically. In this world, the NativePDB tests would continue to invoke %clang-cl (which would point to the build dir without any fancy logic), and build.py would be used by scripts that just want to build an executable for the host, and don't care much about the details of how this is accomplished (the stop-hook tests are a good example of this).

(Among the cosmetic things, I'd suggest to make --source a positional argument and enable spelling of --output as -o, just to make things shorter.)

I would like to ask a general question that I (indirectly) also asked in D54731: Why do we want to implement support for building inferiors in LIT-based tests? IMHO, if we need to support for dealing with specific compilers, we should implement that once in Makefile.rules (which is in a declarative domain-specific-language for describing build logic) and write a dotest.py-style test instead. I'm assuming here that we need the support in Makefile.rules anyway to support the bulk of the tests. Having this support in two places increases the maintenance effort and cognitive load.

I would like to ask a general question that I (indirectly) also asked in D54731: Why do we want to implement support for building inferiors in LIT-based tests?

I think this is not much different than asking "Why do we want LIT-based tests?". If you can't build an inferior, then you can't really do anything with LLDB at all, in which case there's nothing to test. So without support for building inferiors, there is no such thing as lldb lit tests.

IMHO, if we need to support for dealing with specific compilers, we should implement that once in Makefile.rules (which is in a declarative domain-specific-language for describing build logic) and write a dotest.py-style test instead. I'm assuming here that we need the support in Makefile.rules anyway to support the bulk of the tests. Having this support in two places increases the maintenance effort and cognitive load.

At least for the time being, I think of dotest.py style tests as being synonymous with "SB API test". I think there is a need for a testing framework that is more familiar to LLVM developers with a lower barrier to entry as it encourages people to write more tests, which is always a good thing. FWIW, I think the results so far have been positive. I did a search over a 3-month period on the repository. During that time, 37 new lit tests have been added, and 32 dotest tests have been added while 9 have been removed.

Your point about having the support in two places increasing the maintenance effort is reasonable, but if the build script actually does support everything that the Makefile does, then perhaps we could port the existing dotest tests over to use the build script. Make is notoriously finnicky and many people don't like it or understand it well and it's very easy to run into portability problems when using it. So if we can eventually do that, that would actually *decrease* the cognitive load and maintenance burden. I'm not proposing that at this point and I think the benefits of a system such as the one in this patch stand on their own regardless of whether we port dotest builders over to use this script, but it's something to think about regardless.

Note that the Makefiles do not currently support clang-cl since it uses a totally different command line syntax.

zturner updated this revision to Diff 175561.Nov 27 2018, 12:54 PM

Updated with suggestions. Currently all NativePDB tests pass with this version of the script, and all other tests are unaffected since they don't use the new substitution.

Subsequent followups will focus on getting other tests using this, which should increase our test coverage across multiple platforms.

I didn't look at the code in detail, as most of it deals with windows stuff, and I don't know much about those anyway. However, the interesting question for me would be how to make this useful for cross-compiling. Right now that sort of works for NativePDB tests because --compiler=clang-cl implies windows, but that won't help if I want to compile say a linux arm64 binary. I think that instead --arch, we should have a --triple argument, which specifies the exact target you want to build for. That can default to "host", and we can have special pseudo-triples like "host32" and "host64", if we want to be able to say "I want to build for a 32-bit flavour of the host arch". That could also make gcc detection easier, since you could just search for $triple-gcc.

A triple is one way. I don't know much about gcc, does it support the same triple format as clang? clang-cl certainly doesn't support triples since it implies one. Also, are there any existing use cases for specifying a triple in the lit test suite? I do agree we will need the ability to support triple specification, but it seems better to do this as a followup. I don't think it should be too hard as long as we can make some assumptions such as "specification of --triple limits the possible supported compilers", etc.

Another route to take might be to say that this script only supports building for the host, and tests that want to target a specific architecture can just invoke the appropriate compiler directly -- since they know the target arch, they also know the correct compiler option syntax. Plus, these kinds of tests are the ones that are most likely to need fancy compiler command line switches that might be hard to represent generically. In this world, the NativePDB tests would continue to invoke %clang-cl (which would point to the build dir without any fancy logic), and build.py would be used by scripts that just want to build an executable for the host, and don't care much about the details of how this is accomplished (the stop-hook tests are a good example of this).

That's another good possibility.

(Among the cosmetic things, I'd suggest to make --source a positional argument and enable spelling of --output as -o, just to make things shorter.)

I did both of these in the latest update. I need to make it possible to specify multiple inputs, but I'll do that in a followup as there's no immediate need for it in any of the tests I converted.

I didn't look at the code in detail, as most of it deals with windows stuff, and I don't know much about those anyway. However, the interesting question for me would be how to make this useful for cross-compiling. Right now that sort of works for NativePDB tests because --compiler=clang-cl implies windows, but that won't help if I want to compile say a linux arm64 binary. I think that instead --arch, we should have a --triple argument, which specifies the exact target you want to build for. That can default to "host", and we can have special pseudo-triples like "host32" and "host64", if we want to be able to say "I want to build for a 32-bit flavour of the host arch". That could also make gcc detection easier, since you could just search for $triple-gcc.

A triple is one way. I don't know much about gcc, does it support the same triple format as clang?

Gcc is different that clang in that the target triple is baked into the binary. For example, on my machine, I have a binary called x86_64-pc-linux-gnu-gcc (gcc is just a symlink to that), which always produces binaries for x86_64 linux. If I had a arm64 cross-gcc installed, it would be called something like aarch64-linux-gnu-gcc and so on. So the actual mechanism to search for the compiler and invoke it will be different, but a triple is still a good starting point.

Also, are there any existing use cases for specifying a triple in the lit test suite?

You can take a look at the tests in lit/SymbolFile/DWARF. Their purpose is very similar to the NativePDB tests -- check the operation of the dwarf parser, regardless of the host platform. Since mac uses a somewhat different flavour of dwarf than other platforms, the sources are usually compiled twice, targetting linux and mac.

I do agree we will need the ability to support triple specification, but it seems better to do this as a followup. I don't think it should be too hard as long as we can make some assumptions such as "specification of --triple limits the possible supported compilers", etc.

I don't think there needs to be much code initially, but I think it would be good to decide on the direction up front. For example, if we decide to go the triple way then we could avoid the churn in the tests by making all of them use --triple=x86_64-pc-windows instead of the current --arch=64, even if the actual implementation simply aborts in the more complicated cases.

I would like to ask a general question that I (indirectly) also asked in D54731: Why do we want to implement support for building inferiors in LIT-based tests? IMHO, if we need to support for dealing with specific compilers, we should implement that once in Makefile.rules (which is in a declarative domain-specific-language for describing build logic) and write a dotest.py-style test instead. I'm assuming here that we need the support in Makefile.rules anyway to support the bulk of the tests. Having this support in two places increases the maintenance effort and cognitive load.

While I agree that having two ways to do something is bad, I have to say that the current Makefile.rules system is far from perfect. First of, it is geared towards creating a fully working executables (the kind that dotest tests expect), and if you want to do something more exotic, things get a lot harder. Also writing more complex logic in makefiles is hard, particularly when you need to be portable across various shells and make implementations. In fact, the biggest feature of make, automatic dependency tracking, is unused now that we build tests out of tree and clean by nuking the build dir.

I didn't look at the code in detail, as most of it deals with windows stuff, and I don't know much about those anyway. However, the interesting question for me would be how to make this useful for cross-compiling. Right now that sort of works for NativePDB tests because --compiler=clang-cl implies windows, but that won't help if I want to compile say a linux arm64 binary. I think that instead --arch, we should have a --triple argument, which specifies the exact target you want to build for. That can default to "host", and we can have special pseudo-triples like "host32" and "host64", if we want to be able to say "I want to build for a 32-bit flavour of the host arch". That could also make gcc detection easier, since you could just search for $triple-gcc.

A triple is one way. I don't know much about gcc, does it support the same triple format as clang?

Gcc is different that clang in that the target triple is baked into the binary. For example, on my machine, I have a binary called x86_64-pc-linux-gnu-gcc (gcc is just a symlink to that), which always produces binaries for x86_64 linux. If I had a arm64 cross-gcc installed, it would be called something like aarch64-linux-gnu-gcc and so on. So the actual mechanism to search for the compiler and invoke it will be different, but a triple is still a good starting point.

Also, are there any existing use cases for specifying a triple in the lit test suite?

You can take a look at the tests in lit/SymbolFile/DWARF. Their purpose is very similar to the NativePDB tests -- check the operation of the dwarf parser, regardless of the host platform. Since mac uses a somewhat different flavour of dwarf than other platforms, the sources are usually compiled twice, targetting linux and mac.

I do agree we will need the ability to support triple specification, but it seems better to do this as a followup. I don't think it should be too hard as long as we can make some assumptions such as "specification of --triple limits the possible supported compilers", etc.

I don't think there needs to be much code initially, but I think it would be good to decide on the direction up front. For example, if we decide to go the triple way then we could avoid the churn in the tests by making all of them use --triple=x86_64-pc-windows instead of the current --arch=64, even if the actual implementation simply aborts in the more complicated cases.

I think it would be good to use the way dotest works as a starting point. You can specify --arch, and then you can run the test on multiple arches this way by running dotest several times in succession, each with different arch flags.

I think a --triple option could be useful in limited scenarios, but I think that most of the use cases will not need it. Most tests will probably want to specify nothing at all and let the lit configuration pass the correct information through. Actually, I think this is the same with the --arch flag though. Because as soon as you specify something, then it limits the ability of the test to run over and over with different parameters.

Perhaps we could support *both* a --triple and a --arch flag. Maybe we can make the script error if they're both specified. This would give each test the capability to be written in the simplest way possible.

I ran the tests with this change and there are about a dozen new failures:

2018-11-28T20:25:29.2437909Z          ********************
2018-11-28T20:25:29.2581909Z          FAIL: LLDB :: SymbolFile/NativePDB/function-types-builtins.cpp (16962 of 45532)
2018-11-28T20:25:29.2593650Z          ******************** TEST 'LLDB :: SymbolFile/NativePDB/function-types-builtins.cpp' FAILED ********************
2018-11-28T20:25:29.2627299Z          Script:
2018-11-28T20:25:29.2627541Z          --
2018-11-28T20:25:29.2628080Z          : 'RUN: at line 4';   C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\python.exe E:\_work\80\s\llvm\tools\lldb\lit\helper\build.py --compiler=any --arch=64 --tools-dir=E:/_work/80/b/LLVMBuild/Release/bin --compiler=clang-cl --nodefaultlib -o E:\_work\80\b\LLVMBuild\tools\lldb\lit\SymbolFile\NativePDB\Output\function-types-builtins.cpp.tmp.exe -- E:\_work\80\s\llvm\tools\lldb\lit\SymbolFile\NativePDB\function-types-builtins.cpp
2018-11-28T20:25:29.2629494Z          : 'RUN: at line 5';   env LLDB_USE_NATIVE_PDB_READER=1 E:\_work\80\b\LLVMBuild\Release\bin\lldb.EXE -S E:/_work/80/s/llvm/tools/lldb/lit\lit-lldb-init -f E:\_work\80\b\LLVMBuild\tools\lldb\lit\SymbolFile\NativePDB\Output\function-types-builtins.cpp.tmp.exe -s      E:\_work\80\s\llvm\tools\lldb\lit\SymbolFile\NativePDB/Inputs/function-types-builtins.lldbinit | E:\_work\80\b\LLVMBuild\Release\bin\FileCheck.EXE E:\_work\80\s\llvm\tools\lldb\lit\SymbolFile\NativePDB\function-types-builtins.cpp
2018-11-28T20:25:29.2629805Z          --
2018-11-28T20:25:29.2629901Z          Exit Code: 127
2018-11-28T20:25:29.2630016Z          
2018-11-28T20:25:29.2630140Z          Command Output (stdout):
2018-11-28T20:25:29.2630258Z          --
2018-11-28T20:25:29.2630427Z          $ ":" "RUN: at line 4"
2018-11-28T20:25:29.2630915Z          $ "C:\Program" "Files" "(x86)\Microsoft" "Visual" "Studio\Shared\Python36_64\python.exe" "E:\_work\80\s\llvm\tools\lldb\lit\helper\build.py" "--compiler=any" "--arch=64" "--tools-dir=E:/_work/80/b/LLVMBuild/Release/bin" "--compiler=clang-cl" "--nodefaultlib" "-o" "E:\_work\80\b\LLVMBuild\tools\lldb\lit\SymbolFile\NativePDB\Output\function-types-builtins.cpp.tmp.exe" "--" "E:\_work\80\s\llvm\tools\lldb\lit\SymbolFile\NativePDB\function-types-builtins.cpp"
2018-11-28T20:25:29.2631273Z          # command stderr:
2018-11-28T20:25:29.2631382Z          'C:\\Program': command not found

It looks like the script doesn't handle paths that have spaces correctly. I think the issue is on line 45 where we add sys.executable as the command - we need quotes around that. I'm going to rerun the tests with that change to see the result.

lldb/lit/helper/toolchain.py
45 ↗(On Diff #175561)

I think this is where we need quotes around sys.executable, so that if the path contains spaces it is handled correctly

It works for me when you make the command:

command='"%s"' % (sys.executable),

I haven't tested it on Linux yet, but the tests work fine on Windows.

I think it would be good to use the way dotest works as a starting point. You can specify --arch, and then you can run the test on multiple arches this way by running dotest several times in succession, each with different arch flags.

I think a --triple option could be useful in limited scenarios, but I think that most of the use cases will not need it. Most tests will probably want to specify nothing at all and let the lit configuration pass the correct information through. Actually, I think this is the same with the --arch flag though. Because as soon as you specify something, then it limits the ability of the test to run over and over with different parameters.

I am not sure about their relative ratio, but yes, I agree that we have (or want to have) two kinds of tests. One would specify the target and the other would not.

Perhaps we could support *both* a --triple and a --arch flag. Maybe we can make the script error if they're both specified. This would give each test the capability to be written in the simplest way possible.

Yes, I suppose that would work

If there are no further comments I'd like to get this in later today.

This revision was not accepted when it landed; it landed in state Needs Review.Nov 30 2018, 4:25 PM
This revision was automatically updated to reflect the committed changes.