This is an archive of the discontinued LLVM Phabricator instance.

[runtimes] Allow LLVM_BUILTIN_TARGETS to include Darwin
ClosedPublic

Authored by smeenai on Aug 20 2020, 12:45 PM.

Details

Summary

We have two ways of using the runtimes build setup to build the
builtins. You can either have an empty LLVM_BUILTIN_TARGETS (or have it
include the "default" target), in which case builtin_default_target is
called to set up the default target, or you can have actual triples in
LLVM_BUILTIN_TARGETS, in which case builtin_register_target is called
for each triple. builtin_default_target lets you build the builtins for
Darwin (assuming your default triple is Darwin); builtin_register_target
does not.

I don't understand the reason for this distinction. The Darwin builtins
build is special in that a single CMake configure handles building the
builtins for multiple platforms (e.g. macOS, iPhoneSimulator, and iOS)
and architectures (e.g. arm64, armv7, and x86_64). Consequently, if you
specify multiple Darwin triples in LLVM_BUILTIN_TARGETS, expecting each
configure to only build for that particular triple, it won't work.
However, if you specify a *single* x86_64-apple-darwin triple in
LLVM_BUILTIN_TARGETS, that single configure will build the builtins for
all Darwin targets, exactly the same way that the default target would.
The only difference between the configuration for the default target and
the x86_64-apple-darwin triple is that the latter runs the configuration
with -DCOMPILER_RT_DEFAULT_TARGET_ONLY=ON, but that makes no
difference for Apple targets (none of the CMake codepaths which have
different behavior based on that variable are run for Apple targets).

I tested this by running two builtins builds on my Mac, one with the
default target and one with the x86_64-apple-darwin19.5.0 target (which
is the default target triple for my clang). The only relevant
CMakeCache.txt difference was the following, and as discussed above, it
has no effect on the actual build for Apple targets:

-//Default triple for which compiler-rt runtimes will be built.
-COMPILER_RT_DEFAULT_TARGET_TRIPLE:STRING=x86_64-apple-darwin19.5.0
+//No help, variable specified on the command line.
+COMPILER_RT_DEFAULT_TARGET_ONLY:UNINITIALIZED=ON

Furthermore, when I add the -D flag to compiler-rt's libtool
invocations, the libraries produced by the two builds are *identical*.

If anything, I would expect builtin_register_target to complain if you
tried specifying a triple for a particular Apple platform triple (e.g.
macosx), since that's the scenario in which it won't work as you want.
The generic darwin triple should be fine though, as best as I can tell.
I'm happy to add the error for specific Apple platform triples, either
in this diff or in a follow-up.

Diff Detail

Event Timeline

smeenai created this revision.Aug 20 2020, 12:45 PM
smeenai requested review of this revision.Aug 20 2020, 12:45 PM

I think this change makes configuration confusingly inaccurate. Since the compiler-rt Darwin build does not support picking and choosing which targets to configure making the configuration fail if a user tries to is the reasonable approach.

Making the build do something different and unintuitive from what the configuration requested seems undesirable to me.

I think this change makes configuration confusingly inaccurate. Since the compiler-rt Darwin build does not support picking and choosing which targets to configure making the configuration fail if a user tries to is the reasonable approach.

Making the build do something different and unintuitive from what the configuration requested seems undesirable to me.

I'm not sure I understand.

I completely agree with you that it's misleading if e.g. a user specifies x86_64-apple-macosx and arm64-apple-ios in their LLVM_BUILTIN_TARGETS, and expects each of those configurations to only build the listed target. That's not how the Darwin compiler-rt build works.

However, x86_64-apple-darwin is more of a generic Darwin target. If someone understands that and just wants to add x86_64-apple-darwin to their LLVM_BUILTIN_TARGETS (and no other Apple targets) in order to build the builtins for all Apple targets, I don't see why we should disallow that.

I would argue that our current situation is actually misleading. You can add x86_64-apple-macosx to LLVM_BUILTIN_TARGETS without any configuration errors, and then you'll end up building a libclang_rt.ios.a that actually contains object files for macOS. Like I said, I'd be supportive of making this codepath complain if you specified a triple for a specific Apple platform, but I think the generic darwin triple should be fine.

Let me add a few more safeguards to this and see what you think. In particular, I'm even happy for darwin triples to still be an error by default, but to have a CMake option that one can set if they understand how the Darwin compiler-rt build works. (Triples for specific Apple platforms would always be an error.)

I agree that the current situation is not optimal, I was never happy with the default case which is confusing for a lot of people I talked to. I would like to come up with a better solution, but I'm not sure if special casing the x86_64-apple-darwin is sufficient though. With Apple Silicon, I assume we're going to be building runtimes as universal binaries that target multiple architectures.

We could consider completely decoupling runtimes from targets. Originally, each runtime build was identified only by its target, so you'd use RUNTIMES_${target}_${variable} to set target-specific variables. That became insufficient when we introduced the support for multilib, so now you also have a name and you can use RUNTIMES_${target}+${name}_${variable}. We could switch to RUNTIMES_${name}_${variable} and it'd be the responsibility of the user to set the target explicitly if they want it (FWIW it's something we already have to do in our build anyway because of Windows where the CMake behavior when you set CMAKE_${LANG}_COMPILER_TARGET breaks the build). If not, they get the default behavior, matching the default case today. What do you think?

smeenai updated this revision to Diff 286899.Aug 20 2020, 2:11 PM

Add explicit error and flag

Okay, I would argue that this version of the change is much better than the current situation. In particular, it disallows the loophole of specifying a triple for a particular platform instead of "darwin", and it forces users of "darwin" to acknowledge they understand how the Darwin compiler-rt build works by setting a particular variable (and it still disallows multiple Darwin triples). Lemme know what you think.

Should this be enforced for LLVM_RUNTIME_TARGETS in addition to LLVM_BUILTIN_TARGETS?

I agree that the current situation is not optimal, I was never happy with the default case which is confusing for a lot of people I talked to. I would like to come up with a better solution, but I'm not sure if special casing the x86_64-apple-darwin is sufficient though. With Apple Silicon, I assume we're going to be building runtimes as universal binaries that target multiple architectures.

We could consider completely decoupling runtimes from targets. Originally, each runtime build was identified only by its target, so you'd use RUNTIMES_${target}_${variable} to set target-specific variables. That became insufficient when we introduced the support for multilib, so now you also have a name and you can use RUNTIMES_${target}+${name}_${variable}. We could switch to RUNTIMES_${name}_${variable} and it'd be the responsibility of the user to set the target explicitly if they want it (FWIW it's something we already have to do in our build anyway because of Windows where the CMake behavior when you set CMAKE_${LANG}_COMPILER_TARGET breaks the build). If not, they get the default behavior, matching the default case today. What do you think?

To clarify, what would be setting ${name} in this scenario?

I agree that the current situation is not optimal, I was never happy with the default case which is confusing for a lot of people I talked to. I would like to come up with a better solution, but I'm not sure if special casing the x86_64-apple-darwin is sufficient though. With Apple Silicon, I assume we're going to be building runtimes as universal binaries that target multiple architectures.

We could consider completely decoupling runtimes from targets. Originally, each runtime build was identified only by its target, so you'd use RUNTIMES_${target}_${variable} to set target-specific variables. That became insufficient when we introduced the support for multilib, so now you also have a name and you can use RUNTIMES_${target}+${name}_${variable}. We could switch to RUNTIMES_${name}_${variable} and it'd be the responsibility of the user to set the target explicitly if they want it (FWIW it's something we already have to do in our build anyway because of Windows where the CMake behavior when you set CMAKE_${LANG}_COMPILER_TARGET breaks the build). If not, they get the default behavior, matching the default case today. What do you think?

To clarify, what would be setting ${name} in this scenario?

It'd be set by user. Today we have BUILTIN_TARGETS and RUNTIME_TARGETS, we could either repurpose those and create new variables. You'd use it as follows:

list(APPEND BUILTIN_TARGETS "darwin")
set(BUILTINS_darwin_CMAKE_SYSTEM_NAME Darwin CACHE STRING "")
...
list(APPEND BUILTIN_TARGETS "linux")
set(BUILTINS_linux_CMAKE_SYSTEM_NAME Linux CACHE STRING "")
set(BUILTINS_linux_CMAKE_C_COMPILER_TARGET x86_64-linux-gnu CACHE STRING "")
...

Here darwin and linux are arbitrary names.

smeenai updated this revision to Diff 286909.Aug 20 2020, 3:30 PM

Check LLVM_RUNTIME_TARGETS as well

I agree that the current situation is not optimal, I was never happy with the default case which is confusing for a lot of people I talked to. I would like to come up with a better solution, but I'm not sure if special casing the x86_64-apple-darwin is sufficient though. With Apple Silicon, I assume we're going to be building runtimes as universal binaries that target multiple architectures.

We could consider completely decoupling runtimes from targets. Originally, each runtime build was identified only by its target, so you'd use RUNTIMES_${target}_${variable} to set target-specific variables. That became insufficient when we introduced the support for multilib, so now you also have a name and you can use RUNTIMES_${target}+${name}_${variable}. We could switch to RUNTIMES_${name}_${variable} and it'd be the responsibility of the user to set the target explicitly if they want it (FWIW it's something we already have to do in our build anyway because of Windows where the CMake behavior when you set CMAKE_${LANG}_COMPILER_TARGET breaks the build). If not, they get the default behavior, matching the default case today. What do you think?

To clarify, what would be setting ${name} in this scenario?

It'd be set by user. Today we have BUILTIN_TARGETS and RUNTIME_TARGETS, we could either repurpose those and create new variables. You'd use it as follows:

list(APPEND BUILTIN_TARGETS "darwin")
set(BUILTINS_darwin_CMAKE_SYSTEM_NAME Darwin CACHE STRING "")
...
list(APPEND BUILTIN_TARGETS "linux")
set(BUILTINS_linux_CMAKE_SYSTEM_NAME Linux CACHE STRING "")
set(BUILTINS_linux_CMAKE_C_COMPILER_TARGET x86_64-linux-gnu CACHE STRING "")
...

Here darwin and linux are arbitrary names.

Got it. That could be useful in general, although we don't use the multilib setup just yet, so I don't have experience with that. We'd also need to publicize such a change, since people are probably used to the triple being implied.

@beanz had given me a LGTM on IRC (with the addition of the check for LLVM_RUNTIME_TARGETS as well). I'll wait a bit more if there's other comments. (I'll also update the commit message to reflect the diff as it is now.)

phosek accepted this revision.Aug 20 2020, 5:25 PM

LGTM

This revision is now accepted and ready to land.Aug 20 2020, 5:25 PM
This revision was automatically updated to reflect the committed changes.