Setting LLVM_HOST_TRIPLE propagates the information to a few more
places than setting only LLVM_TARGET_ARCH and
LLVM_DEFAULT_TARGET_TRIPLE, and both of those settings take their
defaults from LLVM_HOST_TRIPLE if they're not overridden.
Repository: rG LLVM Github Monorepo
Event Timeline
I believe this is wrong? You're specifying the host triple, i.e. the platform on which the (built) compiler should run.
Neither the current doc nor the proposed change is always right, and they are doing different things.
The point of the existing option is to tell you how to set up the default target (the target implied when no target is specified) to be the native architecture of your cross target rather than the host (which is what it defaults to).
In theory, LLVM_HOST_TRIPLE should be inferable from the build configuration environment so you should never need to specify it explicitly.
Yes - but this whole article is about cross compiling LLVM so that the compiler itself will run on a different architecture. When doing that, AFAIK it's customary to tell the LLVM CMake build system what kind of triple it actually is running on, i.e. setting LLVM_HOST_TRIPLE is possibly relevant whenever cross compiling.
Then secondly, if you're on OS/arch X, and are cross compiling LLVM to run on OS/arch Y, then it's of course possible to give it a default target triple for a third OS/arch Z - but as far as I understood this article, it's about a case where Y and Z are equal, i.e. running on whatever system, building LLVM to run on ARM, to generate code for ARM.
Plus, since LLVM_TARGET_ARCH is the target to use for JIT generation, it essentially needs to be the same architecture as the host on which LLVM is going to run, so it can't really be set to a wildly different arch anyway?
If I cross compile an LLVM to run on Linux/AArch64 and configure it with LLVM_HOST_TRIPLE=aarch64-linux-gnu, then this also implicitly sets LLVM_DEFAULT_TARGET_TRIPLE to the same, unless I have manually set another value for LLVM_DEFAULT_TARGET_TRIPLE - or do you disagree on this bit?
> In theory, LLVM_HOST_TRIPLE should be inferable from the build configuration environment so you should never need to specify it explicitly.
LLVM_HOST_TRIPLE is generally inferable when _not_ cross compiling, but when cross compiling, AFAIK we don't quite infer it. If LLVM_HOST_TRIPLE isn't set, it defaults to LLVM_INFERRED_HOST_TRIPLE, which is set by get_host_triple: https://github.com/llvm/llvm-project/blob/35912ad39d8a0f244f36d24526ec70b8b028a6e0/llvm/cmake/config-ix.cmake#L441-L445
For some targets/OSes, get_host_triple does try to figure out the cross target host triple, but in the generic fallback case it's simply set to the triple of the build host by running the config.guess script: https://github.com/llvm/llvm-project/blob/35912ad39d8a0f244f36d24526ec70b8b028a6e0/llvm/cmake/modules/GetHostTriple.cmake#L48
So for e.g. cross compilation to Linux targets, as far as I can see, you do need to set LLVM_HOST_TRIPLE manually as it will otherwise default to that of the machine where you are doing the cross compilation.
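The fallback chain described above can be modelled roughly as follows. This is only a Python sketch of the behaviour as discussed in this thread, not LLVM's actual CMake code: `resolve_triples` is a hypothetical helper, and `x86_64-unknown-linux-gnu` is an assumed example of what config.guess might report on the build machine in the generic fallback case.

```python
from typing import Optional


def resolve_triples(
    inferred_host_triple: str,
    host_triple: Optional[str] = None,
    default_target_triple: Optional[str] = None,
) -> dict:
    """Model the fallback chain discussed here (not LLVM's real CMake logic)."""
    # LLVM_HOST_TRIPLE falls back to the inferred triple when not set explicitly.
    host = host_triple or inferred_host_triple
    # LLVM_DEFAULT_TARGET_TRIPLE falls back to LLVM_HOST_TRIPLE unless overridden.
    default_target = default_target_triple or host
    return {
        "LLVM_HOST_TRIPLE": host,
        "LLVM_DEFAULT_TARGET_TRIPLE": default_target,
    }


# Assumed triple of the machine doing the cross compilation (generic fallback).
build_machine = "x86_64-unknown-linux-gnu"

# Without an explicit LLVM_HOST_TRIPLE, both values describe the build machine.
implicit = resolve_triples(build_machine)

# With LLVM_HOST_TRIPLE set, the default target triple follows it.
explicit = resolve_triples(build_machine, host_triple="aarch64-linux-gnu")

print(implicit)
print(explicit)
```

In this model, leaving LLVM_HOST_TRIPLE unset while cross compiling to Linux leaves both values pointing at the build machine, which is the case the comment above warns about.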
I think you're misunderstanding how some of this works (or maybe rather the implications of it). As a concrete example: say my development (build) machine is Ubuntu-x86, I'm building LLVM to run on Android-AArch64, and I'm building a JIT to run on Android-AArch64. I NEED LLVM_TARGET_ARCH to be AArch64, otherwise my JIT, when run on Android, will attempt to target x86.
I also _probably_ want the default target triple to be aarch64-linux-..., because I probably want the clang I build to infer AArch64-linux as its default architecture.
Setting LLVM_HOST_TRIPLE to AArch64 on my x86 machine is likely to cause lots of problems; allowing it to be inferred from my build machine is appropriate instead. LLVM_HOST_TRIPLE should only be set explicitly in the odd case where my host machine's architecture and OS can't be identified by our build system.
Yes, I agree
> I also _probably_ want the default target triple to be aarch64-linux-..., because I probably want the clang I build to infer AArch64-linux as its default architecture.
I also agree
> Setting LLVM_HOST_TRIPLE to AArch64 on my x86 machine is likely to cause lots of problems; allowing it to be inferred from my build machine is appropriate instead. LLVM_HOST_TRIPLE should only be set explicitly in the odd case where my host machine's architecture and OS can't be identified by our build system.
No, here I disagree. LLVM_HOST_TRIPLE is documented as "Host on which LLVM binaries will run", not as the host where I'm currently compiling it. We can easily infer the details of the OS where we're doing the build, but usually much less so for the cross target, where the cross compiled LLVM will run.
Ooof... That is the most terribly named variable ever. You are right. I kinda hate the idea of documenting this because that variable name is unnecessarily confusing. In fact, the line directly above the line you changed uses the word host to mean something completely different.
Yeah, it's not really great - but changing it would be kinda a lot of churn for all users who are cross compiling LLVM.
Anyway, my main point here is that whenever you're cross compiling, you more or less do need to set LLVM_HOST_TRIPLE - but you generally don't need to set LLVM_TARGET_ARCH and LLVM_DEFAULT_TARGET_TRIPLE unless you're doing a really, really exotic build. So the documentation should probably explain the most basic cross compilation case, not the most exotic one.
> In fact, the line directly above the line you changed uses the word host to mean something completely different.
Ouch, I hadn't noticed that detail. We probably should reword those bits too, to make it even clearer.
I think that name comes from autoconf which uses build (machine where the software is being built), host (machine where the software is going to run) and target (machine we're going to generate code for).
Since we're already on this topic, in D137451 it was also brought up that having both LLVM_DEFAULT_TARGET_TRIPLE and LLVM_TARGET_TRIPLE is confusing and that perhaps we should only have one (presumably the latter).
In runtimes, we currently use LLVM_DEFAULT_TARGET_TRIPLE to construct the installation path, but that's an ongoing source of issues. Neither LLVM_HOST_TRIPLE nor LLVM_TARGET_TRIPLE seems like the right replacement, since those variables are exported in LLVMConfig.cmake, but in the runtimes build (which uses LLVMConfig.cmake) we need to set the triple based on the host we're compiling runtimes for, not based on the host we compiled LLVM for.
The solution I came up with in D137451 is introducing a new variable LLVM_RUNTIME_TRIPLE to avoid conflict with any of the existing variables. Do you have any other suggestions?
Hmm, I haven't quite followed exactly what LLVM_TARGET_TRIPLE is and which parts of the code it affects. I don't offhand know where it would be relevant, since an LLVM build supports multiple targets.
I agree that LLVM_DEFAULT_TARGET_TRIPLE is confusing and IMO incorrect for the runtimes. But for building LLVM/Clang level code generation, it's a totally valid option though.
> In runtimes, we currently use LLVM_DEFAULT_TARGET_TRIPLE to construct the installation path, but that's an ongoing source of issues. Neither LLVM_HOST_TRIPLE nor LLVM_TARGET_TRIPLE seems like the right replacement
IMO, if we'd follow the autoconf build/host/target nomenclature strictly, then LLVM_HOST_TRIPLE would be the correct name for it; within the context of the runtimes, that denotes what host the compiled code will be running on.
> since those variables are exported in LLVMConfig.cmake but in the runtimes build (which uses LLVMConfig.cmake)
Ok, so the LLVMConfig.cmake from the surrounding LLVM build ends up included in the cmake builds of the individual cross built runtimes, contaminating these variables with values from the host? That's kinda non-ideal.
IMO, we ideally should avoid including that entirely, or at least filter out such settings which are incorrect here. Anything within LLVMConfig.cmake which is about the host of the LLVM build (arch/executable suffix/triples/etc) should be filtered out altogether. At most some parts that relate to the autoconf-labelled "build" environment can be reasonable to include, since the autoconf "build" environment is the same across both - I guess built tools like FileCheck are propagated this way?
> we need to set the triple based on the host we're compiling runtimes for, not based on the host we compiled LLVM for.
> The solution I came up with in D137451 is introducing a new variable LLVM_RUNTIME_TRIPLE to avoid conflict with any of the existing variables. Do you have any other suggestions?
I guess that sounds reasonable. LLVM_DEFAULT_TARGET_TRIPLE is at least kinda wrong. I haven't tried to track what LLVM_TARGET_TRIPLE actually does, but either that or an entirely new variable is probably fine. LLVM_HOST_TRIPLE would be the technically correct choice, but I guess it's messy, especially as long as compiler-rt is still expected to work in a somewhat-cross nature as a project within the main llvm build.
I'm not very familiar with building runtimes, but in case it helps others:
There is further explanation here.
> In the case of target libraries, the machine you’re building for is the machine you specified with --target. So, build is the machine you’re building on (no change there), host is the machine you’re building for (the target libraries are built for the target, so host is the target you specified), and target doesn’t apply (because you’re not building a compiler, you’re building libraries). The configure/make process will adjust these variables as needed.
I.e. if you used --build=A --host=B --target=C (building cross compiler on machine A that will run on machine B and generate code for machine C) for building gcc, it will use --build=A --host=C when building libraries (--target is not applicable).
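To make that adjustment concrete, here is a small Python sketch of the build/host/target shuffle for target libraries. The `Triplet` class and the `target_library_config` helper are hypothetical names used only for illustration; they are not part of autoconf, gcc, or LLVM.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Triplet:
    build: str             # machine the software is being built on
    host: str              # machine the software is going to run on
    target: Optional[str]  # machine the built compiler generates code for


def target_library_config(compiler: Triplet) -> Triplet:
    """Derive the build/host/target used when building target libraries."""
    if compiler.target is None:
        raise ValueError("a compiler configuration needs a target")
    # build stays the same; the libraries run on the compiler's target, so that
    # becomes their host; target does not apply to libraries.
    return Triplet(build=compiler.build, host=compiler.target, target=None)


# Cross compiler built on A, running on B, generating code for C.
compiler = Triplet(build="A", host="B", target="C")
libraries = target_library_config(compiler)
print(libraries)
```

Here the libraries end up with build A and host C, matching the --build=A --host=C case above.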
clang is multi-target and thus does not have a --target equivalent (LLVM_DEFAULT_TARGET_TRIPLE is just the default target). So, when we build the compiler and the runtimes at the same time, we do need some kind of LLVM_RUNTIME_TRIPLE.
For example,
-DLLVM_HOST_TRIPLE=A -DLLVM_DEFAULT_TARGET_TRIPLE=B -DLLVM_RUNTIME_TRIPLE=C
would build a [multi-target] clang that runs on machine A, is used to build runtimes for machine C, and by default generates code for machine B.
When building runtime libraries only, LLVM_DEFAULT_TARGET_TRIPLE is not applicable.
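As a rough illustration of the roles in that example, the following Python sketch assembles the three flags and records what each one is meant to describe. Note that LLVM_RUNTIME_TRIPLE is only the variable proposed in D137451, so it is an assumption here rather than an existing option, and `cmake_triple_flags` and `ROLES` are hypothetical names.

```python
# What each variable describes, following the A/B/C example above.
ROLES = {
    "LLVM_HOST_TRIPLE": "machine the built clang runs on",
    "LLVM_DEFAULT_TARGET_TRIPLE": "machine clang generates code for by default",
    # Proposed in D137451; assumed here, not an established option.
    "LLVM_RUNTIME_TRIPLE": "machine the runtimes are built for",
}


def cmake_triple_flags(host: str, default_target: str, runtime: str) -> list:
    """Build the -D flags for the three triples (illustrative only)."""
    values = {
        "LLVM_HOST_TRIPLE": host,
        "LLVM_DEFAULT_TARGET_TRIPLE": default_target,
        "LLVM_RUNTIME_TRIPLE": runtime,
    }
    return [f"-D{name}={value}" for name, value in values.items()]


flags = cmake_triple_flags(host="A", default_target="B", runtime="C")
print(" ".join(flags))
for name, role in ROLES.items():
    print(f"{name}: {role}")
```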
Added a paragraph explaining what LLVM_HOST_TRIPLE really signifies here. I'll take care of path-to-host-bin for LLVM_NATIVE_TOOL_DIR in a separate patch.
No objections. The intention is to cross-compile LLVM to run on another target and this change preserves that. I'm not well versed in the subtle differences between the CMake variables so I'm happy for others to take the lead on that part.