This is an archive of the discontinued LLVM Phabricator instance.

[cmake] Add new linux toolchain file
Abandoned, Public

Authored by hintonda on Jan 2 2018, 1:30 AM.

Details

Summary

Add new linux toolchain file that allows cross compiling to
linux from other systems, e.g., Darwin.

Also, add a new variable, ADDITIONAL_CLANG_BOOTSTRAP_DEPS, which
allows adding additional dependencies to clang-bootstrap-deps.

Event Timeline

hintonda created this revision. Jan 2 2018, 1:30 AM
smeenai added a subscriber: smeenai. Jan 2 2018, 1:54 AM

Why is this a cache file rather than a toolchain file (one that passes itself to CMake as a toolchain file under some circumstances)? Aren't toolchain files traditionally used for cross-compilation?

cmake/caches/linux-toolchain.cmake:20

Typo: patches

Thanks for taking a look.

Yes, this is for cross-compiling clang+llvm for Linux on Darwin -- and possibly Windows to Linux, but that hasn't been tested -- or Linux to Linux if you have completely different system files. It enforces using --sysroot to find the target's headers and libraries.

Cache files are preferred since they are only loaded once, but toolchain files are more flexible -- particularly when setting -target and --sysroot. Users shouldn't care if the cache file reloads itself as a toolchain file, and keeping everything in one file makes it easier to understand. This version doesn't include arch-specific builtins and runtimes, but that could easily be added.

Also, I'm happy to rename it if that would help.

hintonda updated this revision to Diff 128409. Jan 2 2018, 4:22 AM

Use CMAKE_(C|CXX)_COMPILER_TARGET instead of
CMAKE_(C|CXX)_COMPILER_ARG1, and pass all target variables via
CLANG_BOOTSTRAP_CMAKE_ARGS.

Add variable tests and fix/update comments.
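
For readers unfamiliar with this mechanism, here is a minimal sketch of how a stage1 cache file might pass target variables to stage2 through CLANG_BOOTSTRAP_CMAKE_ARGS. It is not the code from the diff: the triple, the sysroot path, and the helper names TARGET_TRIPLE and TARGET_SYSROOT are placeholders assumed for illustration.

```cmake
# Hypothetical stage1 cache fragment, loaded once with `cmake -C <file>`.
# TARGET_TRIPLE and TARGET_SYSROOT are illustrative names and values,
# not taken from the patch under review.
set(TARGET_TRIPLE "x86_64-unknown-linux-gnu")
set(TARGET_SYSROOT "/path/to/linux/sysroot")

# Ask the Clang build to configure a second stage.
set(CLANG_ENABLE_BOOTSTRAP ON CACHE BOOL "")

# Extra arguments for the stage2 configure step. With Clang as the
# compiler, CMake turns CMAKE_<LANG>_COMPILER_TARGET into --target=<triple>
# and CMAKE_SYSROOT into --sysroot=<path>.
set(CLANG_BOOTSTRAP_CMAKE_ARGS
  -DCMAKE_SYSTEM_NAME=Linux
  -DCMAKE_C_COMPILER_TARGET=${TARGET_TRIPLE}
  -DCMAKE_CXX_COMPILER_TARGET=${TARGET_TRIPLE}
  -DCMAKE_SYSROOT=${TARGET_SYSROOT}
  CACHE STRING "")
```

Using the dedicated target variables rather than CMAKE_(C|CXX)_COMPILER_ARG1 lets CMake decide how to spell the flag for the compiler in use.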

Cache files are preferred since they are only loaded once

Isn't being loaded multiple times precisely the behavior needed for cross-compilation, though? You want all of your CMake configuration checks (which are independent CMake configures) to load your toolchain file, which is what you get automatically (and cache files don't behave that way).

From what I understand, the if part of the top-level if(DEFINED SYSROOT) is essentially functioning as a cache file to set up the stage2 build, and the else part is used as a toolchain file for that build. I think it would be cleaner to separate the two out; other cache files seem to be split out into stage1 and stage2 caches, for example (over here it would be stage1 cache and a stage2 toolchain, but the concept is similar).
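
To make the reload behavior concrete, here is a rough sketch of a standalone stage2 toolchain file. The variable LINUX_SYSROOT is a hypothetical user-supplied name (the patch itself uses a different layout), and CMAKE_TRY_COMPILE_PLATFORM_VARIABLES requires CMake 3.6 or newer.

```cmake
# Hypothetical toolchain file, passed as -DCMAKE_TOOLCHAIN_FILE=<this file>.
# CMake reads it for the main configure and again for every try_compile()
# project, so each configuration check sees the same target settings.
set(CMAKE_SYSTEM_NAME Linux)

# LINUX_SYSROOT is an assumed name for a value given with -D on the
# command line. Variables set that way are not visible inside
# try_compile() projects unless they are listed here (CMake 3.6+).
list(APPEND CMAKE_TRY_COMPILE_PLATFORM_VARIABLES LINUX_SYSROOT)

if(DEFINED LINUX_SYSROOT)
  # CMake adds --sysroot=<path> for compilers that support it.
  set(CMAKE_SYSROOT "${LINUX_SYSROOT}")
endif()
```

A cache file loaded with `-C` gets none of this: it only seeds the cache of the one build tree it was given to.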

cmake/caches/linux-toolchain.cmake:3

Cross-compilation terminology is kinda weird, and traditionally, the "host" is actually the system the built binaries will be run on (Linux in this case), whereas the build machine is the "build" (but of course that word is super ambiguous). I think LLVM generally sticks to that terminology though, e.g. LLVM_HOST_TRIPLE.

cmake/caches/linux-toolchain.cmake:85

Nit: write this out as a list instead of a string with semicolons? (I know they're equivalent, but the list reads nicer IMO.)
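
A small script illustrating the equivalence mentioned in this nit. The tool names are only examples, and the variable names are made up for the sketch. It can be run with `cmake -P <file>`.

```cmake
# Two spellings of the same CMake list. The tool names are illustrative.
set(DEPS_AS_STRING "llvm-ar;llvm-ranlib;lld")
set(DEPS_AS_LIST
  llvm-ar
  llvm-ranlib
  lld)

# set() joins multiple values with semicolons, so both variables hold
# the same string and the same three list elements.
list(LENGTH DEPS_AS_STRING string_len)
list(LENGTH DEPS_AS_LIST list_len)

if("${DEPS_AS_STRING}" STREQUAL "${DEPS_AS_LIST}" AND string_len EQUAL list_len)
  message(STATUS "equivalent")
else()
  message(FATAL_ERROR "the two spellings differ")
endif()
```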

cmake/caches/linux-toolchain.cmake:89

Not exactly related, but I wonder why the LLVM build needs ranlib (rather than just invoking ar appropriately).

cmake/caches/linux-toolchain.cmake:103

The CMake documentation for CMAKE_SYSTEM_NAME says CMAKE_SYSTEM_VERSION should also be set when cross-compiling (though I haven't seen any ill effects from not doing so). Setting CMAKE_SYSTEM_PROCESSOR probably doesn't hurt either.
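
As a sketch of that suggestion, the three variables could be set together in the toolchain part. The version and processor values below are placeholders, not values from the patch.

```cmake
# Hypothetical toolchain fragment. Setting CMAKE_SYSTEM_NAME explicitly
# is what puts CMake into cross-compiling mode.
set(CMAKE_SYSTEM_NAME Linux)

# Placeholder values: pick ones that describe the actual target system.
set(CMAKE_SYSTEM_VERSION 1)
set(CMAKE_SYSTEM_PROCESSOR x86_64)
```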

beanz added a comment. Jan 2 2018, 8:58 AM

You should split the CMake cache file you created into two files: (1) a CMake cache file to manage the build configuration and (2) a toolchain file for targeting Linux. As @smeenai pointed out, we absolutely want the behavior of the toolchain file being loaded multiple times. That is the correct way this build should work.

For bootstrap builds where you want the stage1 to run on your build host, you should be able to set BOOTSTRAP_CMAKE_TOOLCHAIN_FILE in the first stage build, to signal to the first stage build that the second stage will be cross-compiled, and we can customize the multi-stage dependencies correctly based on that. That avoids the need for the ADDITIONAL_CLANG_BOOTSTRAP_DEPS variable, which feels a bit hacky to me.
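
A minimal sketch of what that suggestion could look like in a stage1 cache file. It assumes the Clang build forwards BOOTSTRAP_-prefixed variables to the stage2 configure with the prefix stripped. The toolchain file path is a placeholder, and whether stage1 adjusts its dependencies in response depends on the LLVM version in use.

```cmake
# Hypothetical stage1 cache fragment, loaded with `cmake -C <file>`.
set(CLANG_ENABLE_BOOTSTRAP ON CACHE BOOL "")

# Assumption: variables named BOOTSTRAP_<VAR> reach the stage2 configure
# as <VAR>. The path below is a placeholder.
set(BOOTSTRAP_CMAKE_TOOLCHAIN_FILE
  "/path/to/Linux-toolchain.cmake" CACHE FILEPATH "")

# The same prefix can carry other stage2-only settings.
set(BOOTSTRAP_CMAKE_SYSTEM_VERSION "1" CACHE STRING "")
```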

You should split the CMake cache file you created into two files: (1) a CMake cache file to manage the build configuration and (2) a toolchain file for targeting Linux. As @smeenai pointed out, we absolutely want the behavior of the toolchain file being loaded multiple times. That is the correct way this build should work.

I really like keeping this in a single file, but will break it up if necessary.

The if part of the if/else is used in stage1 as a cache file, and the else part is used in stage2 as a toolchain file (and, as you said, is loaded many times). Splitting this into two files won't make much difference in that regard.

For bootstrap builds where you want the stage1 to run on your build host, you should be able to set BOOTSTRAP_CMAKE_TOOLCHAIN_FILE in the first stage build, to signal to the first stage build that the second stage will be cross-compiled, and we can customize the multi-stage dependencies correctly based on that. That avoids the need for the ADDITIONAL_CLANG_BOOTSTRAP_DEPS variable, which feels a bit hacky to me.

Unless there's another way to do it, it's not hacky. I believe the term is "escape hatch".

While I'm happy to use BOOTSTRAP_CMAKE_TOOLCHAIN_FILE instead of passing -DCMAKE_TOOLCHAIN_FILE=${CMAKE_CURRENT_LIST_FILE}, I do not see how it helps with this problem. When running ninja stage2, I need to ensure that the dependencies were built. BOOTSTRAP_LLVM_ENABLE_LLD can be used to add lld to the dependency list, but since I'm setting CLANG_DEFAULT_LINKER=lld, I don't want clang adding -fuse-ld.

If I run this on Linux for both stages, it doesn't matter, because clang/CMakeLists.txt adds llvm-ar and llvm-ranlib automatically, but since I'm on APPLE (see clang/CMakeLists.txt:559), they don't get added.

So, how else would I add them?

cmake/caches/linux-toolchain.cmake:3

I'll work on cleaning up this comment, but the idea is that we cross compile on any host system, e.g., Linux, Darwin, Windows, etc., and target Linux.

cmake/caches/linux-toolchain.cmake:85

Perhaps, but this is the style used throughout the clang+llvm cmake files.

cmake/caches/linux-toolchain.cmake:89

The Darwin version of ranlib doesn't handle ELF binaries, so we need the one we build in stage1.
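
One way this might look in the toolchain part, as a sketch only: point CMake at the archiver tools produced by stage1. STAGE1_BIN_DIR is a hypothetical variable and its value is a placeholder.

```cmake
# Hypothetical toolchain fragment. STAGE1_BIN_DIR is an assumed name for
# the directory holding the tools built in stage1.
set(STAGE1_BIN_DIR "/path/to/stage1/bin")

# Use the ELF-aware LLVM tools instead of the host's ar and ranlib.
set(CMAKE_AR "${STAGE1_BIN_DIR}/llvm-ar" CACHE FILEPATH "")
set(CMAKE_RANLIB "${STAGE1_BIN_DIR}/llvm-ranlib" CACHE FILEPATH "")
```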

cmake/caches/linux-toolchain.cmake:103

These can be passed to stage1 as BOOTSTRAP_CMAKE_SYSTEM_VERSION, etc., allowing the user full control. I'll add a note to the comments up top.

Did a quick test and setting BOOTSTRAP_CMAKE_TOOLCHAIN_FILE does not work in this case.

@hintonda I think this should be a platform file in https://github.com/llvm-mirror/llvm/tree/master/cmake/platforms rather than a Clang cache file. Platform files are concerned with the host platform (including cross-compilation), while cache files are related to the distribution setup. What you're trying to do is the former rather than the latter. Some aspects of your setup, like the bootstrap tool setup, are already handled by the 2-stage build, so you should use that rather than reimplementing your own solution.

Thanks for the pointer. I'll rework the patch along the lines you suggest.

hintonda abandoned this revision. Jan 2 2018, 4:27 PM

Thanks for all your suggestions.

Planning to rework and move the toolchain-specific part to llvm/cmake/platforms and abandon the cache part altogether.