This is an archive of the discontinued LLVM Phabricator instance.

Differential D142404

[docs] Prefer setting LLVM_HOST_TRIPLE instead of LLVM_DEFAULT_TARGET_TRIPLE and LLVM_TARGET_ARCH
Closed · Public

Authored by mstorsjo on Jan 23 2023, 1:52 PM.

Details

Reviewers

phosek
peter.smith
beanz

Commits

rGcb19e3b20d92: [docs] Prefer setting LLVM_HOST_TRIPLE instead of LLVM_DEFAULT_TARGET_TRIPLE…

Summary

Setting LLVM_HOST_TRIPLE propagates the information to a few more
places than if only setting LLVM_TARGET_ARCH and
LLVM_DEFAULT_TARGET_TRIPLE, while both of those settings get their
defaults implied from LLVM_HOST_TRIPLE if they're not overridden.
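
As a rough illustration of the defaulting described in this summary, the following Python sketch models how the two other settings could fall back to values derived from LLVM_HOST_TRIPLE. The function name `resolve_defaults` is hypothetical, and taking the first triple component as the architecture is a simplification; the real logic lives in LLVM's CMake files and resolves the architecture to a backend name.

```python
def resolve_defaults(host_triple, default_target_triple=None, target_arch=None):
    """Toy model: unset values fall back to ones derived from the host triple."""
    if default_target_triple is None:
        default_target_triple = host_triple
    if target_arch is None:
        # Simplification (assumption): use the architecture field of the triple as-is.
        target_arch = host_triple.split("-")[0]
    return {
        "LLVM_HOST_TRIPLE": host_triple,
        "LLVM_DEFAULT_TARGET_TRIPLE": default_target_triple,
        "LLVM_TARGET_ARCH": target_arch,
    }


# Only the host triple is given; the other two values are derived from it.
settings = resolve_defaults("arm-linux-gnueabihf")
```

An explicitly passed `default_target_triple` or `target_arch` wins over the derived value, which corresponds to the "if they're not overridden" part of the summary.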

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

mstorsjo created this revision. Jan 23 2023, 1:52 PM

Herald added a project: Restricted Project. Jan 23 2023, 1:52 PM

mstorsjo requested review of this revision. Jan 23 2023, 1:52 PM

I believe this is wrong? You're specifying the host triple, i.e. the platform on which the (built) compiler should run.

tstellar added a subscriber: tstellar. Jan 23 2023, 2:20 PM

Neither the current doc nor the proposed change is always right, and they do different things.

The point of the existing option is to tell you how to set up the default target (the target implied when no target is specified) to be the native architecture of your cross target rather than the host (which is what it defaults to).

In theory, LLVM_HOST_TRIPLE should be inferable from the build configuration environment so you should never need to specify it explicitly.

In D142404#4074957, @barannikov88 wrote:

I believe this is wrong? You're specifying the host triple, i.e. the platform on which the (built) compiler should run.

Yes - but this whole article is about cross compiling LLVM so that the compiler itself will run on a different architecture. When doing that, AFAIK it's customary to tell the LLVM CMake build system what kind of triple it actually is running on, i.e. setting LLVM_HOST_TRIPLE is possibly relevant whenever cross compiling.

Then secondly, if you're on OS/arch X, and are cross compiling LLVM to run on OS/arch Y, then it's of course possible to give it a default target triple and for a third OS/arch Z - but as far as I understood this article, it's about a case where Y and Z are equal, i.e. running on whatever system, building LLVM to run on ARM, to generate code for ARM.

Plus, since LLVM_TARGET_ARCH is the target to use for JIT generation, it essentially needs to be the same architecture as the host on which LLVM is going to run, so it can't really be set to a wildly different arch anyway?

In D142404#4074961, @beanz wrote:

The point of the existing option is to tell you how to set up the default target (the target implied when no target is specified) to be the native architecture of your cross target rather than the host (which is what it defaults to).

If I cross compile an LLVM to run on Linux/AArch64 and configure it with LLVM_HOST_TRIPLE=aarch64-linux-gnu, then this also implicitly sets LLVM_DEFAULT_TARGET_TRIPLE to the same value, unless I have manually set another value for LLVM_DEFAULT_TARGET_TRIPLE - or do you disagree on this bit?

In theory, LLVM_HOST_TRIPLE should be inferable from the build configuration environment so you should never need to specify it explicitly.

LLVM_HOST_TRIPLE is generally inferable when _not_ cross compiling, but when cross compiling, AFAIK we don't quite infer it. If LLVM_HOST_TRIPLE isn't set, it defaults to LLVM_INFERRED_HOST_TRIPLE, which is set with get_host_triple: https://github.com/llvm/llvm-project/blob/35912ad39d8a0f244f36d24526ec70b8b028a6e0/llvm/cmake/config-ix.cmake#L441-L445

For some targets/OSes, get_host_triple does try to figure out the cross target host triple, but in the generic fallback case it is simply set to the build host by running the config.guess script: https://github.com/llvm/llvm-project/blob/35912ad39d8a0f244f36d24526ec70b8b028a6e0/llvm/cmake/modules/GetHostTriple.cmake#L48

So for e.g. cross compilation to Linux targets, as far as I can see, you do need to set LLVM_HOST_TRIPLE manually as it will otherwise default to that of the machine where you are doing the cross compilation.
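
The fallback described above can be sketched as follows. This is a toy model, not the CMake code: `effective_host_triple` is a hypothetical name, the triples are example values, and only the generic fallback case is modelled, in which the inferred triple is that of the build machine.

```python
def effective_host_triple(explicit_host_triple, build_machine_triple):
    """Return the triple the build would treat as LLVM_HOST_TRIPLE."""
    if explicit_host_triple:
        return explicit_host_triple
    # Generic fallback: the inferred triple describes the build machine.
    return build_machine_triple


build_machine = "x86_64-unknown-linux-gnu"  # example triple of the machine doing the build

# Without an explicit setting, the cross-built LLVM would be described
# as running on the build machine.
unset = effective_host_triple(None, build_machine)

# With an explicit setting, the triple of the machine LLVM will run on is used.
explicit = effective_host_triple("aarch64-linux-gnu", build_machine)

print(unset)
print(explicit)
```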

In D142404#4074962, @mstorsjo wrote:

Plus, since LLVM_TARGET_ARCH is the target to use for JIT generation, it essentially needs to be the same architecture as the host on which LLVM is going to run, so it can't really be set to a wildly different arch anyway?

I think you're misunderstanding how some of this works (or maybe rather the implications of it). As a concrete example: If my host build development is Ubuntu-x86, and I'm building LLVM to run on Android-AArch64, and I'm building a JIT to run on Android-AArch64. I NEED the LLVM_TARGET_ARCH to be AArch64, otherwise my JIT when run on Android will attempt to target x86.

I also _probably_ want the default target triple to be aarch64-linux-..., because I probably want the clang I build to infer AArch64-linux as its default architecture.

Setting LLVM_HOST_TRIPLE to Aarch64 on my x86 machine is likely to cause lots of problems, instead allowing it to be inferred from my build machine is appropriate. LLVM_HOST_TRIPLE should only be set explicitly in the odd case where my host machine's architecture and OS can't be identified by our build system.

In D142404#4074967, @beanz wrote:

I think you're misunderstanding how some of this works (or maybe rather the implications of it). As a concrete example: If my host build development is Ubuntu-x86, and I'm building LLVM to run on Android-AArch64, and I'm building a JIT to run on Android-AArch64. I NEED the LLVM_TARGET_ARCH to be AArch64, otherwise my JIT when run on Android will attempt to target x86.

Yes, I agree

I also _probably_ want the default target triple to be aarch64-linux-..., because I probably want the clang I build to infer AArch64-linux as its default architecture.

I also agree

Setting LLVM_HOST_TRIPLE to Aarch64 on my x86 machine is likely to cause lots of problems, instead allowing it to be inferred from my build machine is appropriate. LLVM_HOST_TRIPLE should only be set explicitly in the odd case where my host machine's architecture and OS can't be identified by our build system.

No, here I disagree. LLVM_HOST_TRIPLE is documented as "Host on which LLVM binaries will run", not as the host where I'm currently compiling it. We can easily infer the details of the OS where we're doing the build, but usually much less so for the cross target, where the cross compiled LLVM will run.

In D142404#4074962, @mstorsjo wrote:

Yes - but this whole article is about cross compiling LLVM so that the compiler itself will run on a different architecture. When doing that, AFAIK it's customary to tell the LLVM CMake build system what kind of triple it actually is running on, i.e. setting LLVM_HOST_TRIPLE is possibly relevant whenever cross compiling.

Ah, I get it! Sorry for the noise.

In D142404#4074985, @mstorsjo wrote:

No, here I disagree. LLVM_HOST_TRIPLE is documented as "Host on which LLVM binaries will run", not as the host where I'm currently compiling it. We can easily infer the details of the OS where we're doing the build, but usually much less so for the cross target, where the cross compiled LLVM will run.

Ooof... That is the most terribly named variable ever. You are right. I kinda hate the idea of documenting this because that variable name is unnecessarily confusing. In fact, the line directly above the line you changed uses the word host to mean something completely different.

In D142404#4074990, @beanz wrote:

In D142404#4074985, @mstorsjo wrote:

No, here I disagree. LLVM_HOST_TRIPLE is documented as "Host on which LLVM binaries will run", not as the host where I'm currently compiling it. We can easily infer the details of the OS where we're doing the build, but usually much less so for the cross target, where the cross compiled LLVM will run.

Ooof... That is the most terribly named variable ever. You are right. I kinda hate the idea of documenting this because that variable name is unnecessarily confusing.

Yeah, it's not really great - but changing it would be kinda a lot of churn for all users who are cross compiling LLVM.

Anyway, my main point here is that whenever you're cross compiling, you more or less do need to set LLVM_HOST_TRIPLE - but you generally don't need to set LLVM_TARGET_ARCH and LLVM_DEFAULT_TARGET_TRIPLE unless you're doing a really, really exotic build. So the documentation should probably explain the most basic cross compilation case, not the most exotic one.

In fact, the line directly above the line you changed uses the word host to mean something completely different.

Ouch, I hadn't noticed that detail. We probably should reword those bits too, to make it even clearer.

Harbormaster completed remote builds in B209469: Diff 491504. Jan 23 2023, 3:08 PM

In D142404#4075020, @mstorsjo wrote:

In D142404#4074990, @beanz wrote:

In D142404#4074985, @mstorsjo wrote:

No, here I disagree. LLVM_HOST_TRIPLE is documented as "Host on which LLVM binaries will run", not as the host where I'm currently compiling it. We can easily infer the details of the OS where we're doing the build, but usually much less so for the cross target, where the cross compiled LLVM will run.

Ooof... That is the most terribly named variable ever. You are right. I kinda hate the idea of documenting this because that variable name is unnecessarily confusing.

Yeah, it's not really great - but changing it would be kinda a lot of churn for all users who are cross compiling LLVM.

I think that name comes from autoconf which uses build (machine where the software is being built), host (machine where the software is going to run) and target (machine we're going to generate code for).

Since we're already on this topic, in D137451 it was also brought up that having both LLVM_DEFAULT_TARGET_TRIPLE and LLVM_TARGET_TRIPLE is confusing and that perhaps we should only have one (presumably the latter).

In runtimes, we currently use LLVM_DEFAULT_TARGET_TRIPLE to construct the installation path but that's an ongoing source of issues. Neither LLVM_HOST_TRIPLE nor LLVM_TARGET_TRIPLE seems like the right replacement, since those variables are exported in LLVMConfig.cmake but in the runtimes build (which uses LLVMConfig.cmake) we need to set the triple based on the host we're compiling runtimes for, not based on the host we compiled LLVM for.

The solution I came up with in D137451 is introducing a new variable LLVM_RUNTIME_TRIPLE to avoid conflict with any of the existing variables. Do you have any other suggestions?

In D142404#4075147, @phosek wrote:

Since we're already on this topic, in D137451 it was also brought up that having both LLVM_DEFAULT_TARGET_TRIPLE and LLVM_TARGET_TRIPLE is confusing and that perhaps we should only have one (presumably the latter).

Hmm, I haven't quite followed exactly what LLVM_TARGET_TRIPLE is and which parts of the code it affects. I don't offhand know where it would be relevant, since an LLVM build supports multiple targets.

I agree that LLVM_DEFAULT_TARGET_TRIPLE is confusing and IMO incorrect for the runtimes. But for building LLVM/Clang level code generation, it's a totally valid option though.

In runtimes, we currently use LLVM_DEFAULT_TARGET_TRIPLE to construct the installation path but that's an ongoing source of issues. Neither LLVM_HOST_TRIPLE nor LLVM_TARGET_TRIPLE seems like the right replacement

IMO, if we'd follow the autoconf build/host/target nomenclature strictly, then LLVM_HOST_TRIPLE would be the correct name for it; within the context of the runtimes, that denotes what host the compiled code will be running on.

since those variables are exported in LLVMConfig.cmake but in the runtimes build (which uses LLVMConfig.cmake)

Ok, so the LLVMConfig.cmake from the surrounding LLVM build ends up included in the cmake builds of the individual cross built runtimes, contaminating these variables with values from the host? That's kinda non-ideal.

IMO, we ideally should avoid including that entirely, or at least filter out such settings which are incorrect here. Anything within LLVMConfig.cmake which is about the host of the LLVM build (arch/executable suffix/triples/etc) should be filtered out altogether. At most some parts that relate to the autoconf-labelled "build" environment can be reasonable to include, since the autoconf "build" environment is the same across both - I guess built tools like FileCheck are propagated this way?

we need to set the triple based on the host we're compiling runtimes for, not based on the host we compiled LLVM for.

The solution I came up with in D137451 is introducing a new variable LLVM_RUNTIME_TRIPLE to avoid conflict with any of the existing variables. Do you have any other suggestions?

I guess that sounds reasonable. LLVM_DEFAULT_TARGET_TRIPLE is at least kinda wrong. I haven't tried to track what LLVM_TARGET_TRIPLE actually does though, but either that or an entirely new variable is probably fine. LLVM_HOST_TRIPLE would be technically correct, but I guess it's messy, especially as long as compiler-rt still is expected to work in a somewhat-cross nature as a project within the main llvm build.

I'm not very familiar with building runtimes, but in case it helps others:

In D142404#4075115, @phosek wrote:

I think that name comes from autoconf which uses build (machine where the software is being built), host (machine where the software is going to run) and target (machine we're going to generate code for).

There is further explanation here.

In the case of target libraries, the machine you’re building for is the machine you specified with --target. So, build is the machine you’re building on (no change there), host is the machine you’re building for (the target libraries are built for the target, so host is the target you specified), and target doesn’t apply (because you’re not building a compiler, you’re building libraries). The configure/make process will adjust these variables as needed.

I.e. if you used --build=A --host=B --target=C (building cross compiler on machine A that will run on machine B and generate code for machine C) for building gcc, it will use --build=A --host=C when building libraries (--target is not applicable).
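
The adjustment described above can be written out as a small Python sketch. The `Config` type and the `config_for_target_libraries` function are hypothetical names used only for illustration; they model the build/host/target bookkeeping, not the actual configure machinery.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Config:
    build: str             # machine the software is built on
    host: str              # machine the built software runs on
    target: Optional[str]  # machine the built compiler generates code for


def config_for_target_libraries(compiler: Config) -> Config:
    """Target libraries run on the compiler's target, so it becomes their host."""
    if compiler.target is None:
        raise ValueError("compiler configuration needs a target")
    # Libraries are not compilers, so "target" no longer applies.
    return Config(build=compiler.build, host=compiler.target, target=None)


# --build=A --host=B --target=C for the compiler itself
gcc = Config(build="A", host="B", target="C")

# The libraries built for that compiler get --build=A --host=C
libs = config_for_target_libraries(gcc)
```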

clang is multi-target and thus does not have a --target equivalent (LLVM_DEFAULT_TARGET_TRIPLE is just the default target). So, when we build the compiler and the runtimes at the same time, we do need some kind of LLVM_RUNTIME_TRIPLE.
For example,
-DLLVM_HOST_TRIPLE=A -DLLVM_DEFAULT_TARGET_TRIPLE=B -DLLVM_RUNTIME_TRIPLE=C
would build a [multi-target] clang that runs on machine A, is used to build runtimes for machine C, but by default generates code for machine B.

When building runtime libraries only, LLVM_DEFAULT_TARGET_TRIPLE is not applicable.

Added a paragraph explaining what LLVM_HOST_TRIPLE really signifies here. I'll take care of path-to-host-bin for LLVM_NATIVE_TOOL_DIR in a separate patch.

mstorsjo mentioned this in D142960: [docs] Rewrite/improve the docs for LLVM_NATIVE_TOOL_DIR. Jan 31 2023, 1:14 AM

mstorsjo added a child revision: D142960: [docs] Rewrite/improve the docs for LLVM_NATIVE_TOOL_DIR. Jan 31 2023, 1:14 AM

Harbormaster completed remote builds in B210927: Diff 493513. Jan 31 2023, 2:09 AM

No objections. The intention is to cross-compile LLVM to run on another target and this change preserves that. I'm not well versed in the subtle differences between the CMake variables so I'm happy for others to take the lead on that part.

LGTM

This revision is now accepted and ready to land. Feb 2 2023, 9:30 AM

This revision was landed with ongoing or failed builds. Feb 3 2023, 12:57 AM

Closed by commit rGcb19e3b20d92: [docs] Prefer setting LLVM_HOST_TRIPLE instead of LLVM_DEFAULT_TARGET_TRIPLE… (authored by mstorsjo).

This revision was automatically updated to reflect the committed changes.

mstorsjo added a commit: rGcb19e3b20d92: [docs] Prefer setting LLVM_HOST_TRIPLE instead of LLVM_DEFAULT_TARGET_TRIPLE….

Revision Contents

Path: llvm/docs/HowToCrossCompileLLVM.rst
Size: 8 lines

Diff 494537

llvm/docs/HowToCrossCompileLLVM.rst

 For more information on how to configure CMake for LLVM/Clang,
 see :doc:`CMake`.

 The CMake options you need to add are:

 * ``-DCMAKE_SYSTEM_NAME=<target-system>``
 * ``-DCMAKE_INSTALL_PREFIX=<install-dir>``
 * ``-DLLVM_NATIVE_TOOL_DIR=<path-to-host-bin>``
-* ``-DLLVM_DEFAULT_TARGET_TRIPLE=arm-linux-gnueabihf``
-* ``-DLLVM_TARGET_ARCH=ARM``
+* ``-DLLVM_HOST_TRIPLE=arm-linux-gnueabihf``
 * ``-DLLVM_TARGETS_TO_BUILD=ARM``

 Note: ``CMAKE_CROSSCOMPILING`` is always set automatically when ``CMAKE_SYSTEM_NAME`` is set. Don't put ``-DCMAKE_CROSSCOMPILING=TRUE`` in your options.

+Also note that ``LLVM_HOST_TRIPLE`` specifies the triple of the system
+that the cross built LLVM is going to run on - the flag is named based
+on the autoconf build/host/target nomenclature. (This flag implicitly sets
+other defaults, such as ``LLVM_DEFAULT_TARGET_TRIPLE``.)
+
 If you're compiling with GCC, you can use architecture options for your target,
 and the compiler driver will detect everything that it needs:

 * ``-DCMAKE_CXX_FLAGS='-march=armv7-a -mcpu=cortex-a9 -mfloat-abi=hard'``

 However, if you're using Clang, the driver might not be up-to-date with your
 specific Linux distribution, version or GCC layout, so you'll need to fudge.
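
For illustration, the options listed on the new side of this diff could be assembled into a CMake command line as in the following Python sketch. The helper name `cross_compile_args` and the example paths are hypothetical; the option names are the ones shown in the diff. LLVM_DEFAULT_TARGET_TRIPLE and LLVM_TARGET_ARCH are deliberately left out so that they take the defaults implied by LLVM_HOST_TRIPLE.

```python
def cross_compile_args(system_name, install_dir, native_tool_dir, host_triple, targets):
    """Build the -D options from the updated documentation as a list of strings."""
    return [
        f"-DCMAKE_SYSTEM_NAME={system_name}",
        f"-DCMAKE_INSTALL_PREFIX={install_dir}",
        f"-DLLVM_NATIVE_TOOL_DIR={native_tool_dir}",
        f"-DLLVM_HOST_TRIPLE={host_triple}",
        # LLVM_TARGETS_TO_BUILD takes a semicolon-separated list.
        f"-DLLVM_TARGETS_TO_BUILD={';'.join(targets)}",
    ]


args = cross_compile_args(
    system_name="Linux",
    install_dir="/opt/llvm-arm",             # hypothetical install location
    native_tool_dir="/opt/llvm-native/bin",  # hypothetical path to build-machine tools
    host_triple="arm-linux-gnueabihf",
    targets=["ARM"],
)

# Print the resulting command; the source path is left as a placeholder.
print(" ".join(["cmake", *args, "<path-to-llvm-source>"]))
```

Note that CMAKE_CROSSCOMPILING is not passed, in line with the note in the documentation that it is set automatically when CMAKE_SYSTEM_NAME is set.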