This is an archive of the discontinued LLVM Phabricator instance.

Disable compiler launcher on external projects and multi stage clang
ClosedPublic

Authored by haowei on Feb 22 2023, 4:55 PM.

Details

Summary

When using compiler caching programs like ccache, there is no point to use them on external projects or multi-stage clang builds. As these builds use fresh from source code toolchain and will pollute the build cache. If a compiler launcher is still required, a user can explicitly define CMAKE_C_COMPILER_LAUNCHER and CMAKE_CXX_COMPILER_LAUNCHER in CLANG_BOOTSTRAP_PASSTHROUGH and LLVM_EXTERNAL_PROJECT_PASSTHROUGHflags to enable compiler launcher in these builds.

Diff Detail

Event Timeline

haowei created this revision.Feb 22 2023, 4:55 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 22 2023, 4:55 PM
Herald added a subscriber: abrachet. · View Herald Transcript
haowei requested review of this revision.Feb 22 2023, 4:55 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptFeb 22 2023, 4:55 PM

A more generic version might be to have a list of variables to (not) passthrough.

First option is to remove {C,CXX}_COMPILER_LAUNCHER from _BOOTSTRAP_DEFAULT_PASSTHROUGH and then include it as needed by setting something like CLANG_BOOTSTRAP_EXTRA_PASSTHROUGH=C_COMPILER_LAUNCHER;CXX_COMPILER_LAUNCHER.
Second option is to provide a way to filter out variable from passthrough, for example CLANG_BOOTSTRAP_NO_PASSTHROUGH=C_COMPILER_LAUNCHER;CXX_COMPILER_LAUNCHER.

I slightly prefer the first option, but it'd be a breaking change, so the second option might be the only feasible one.

llvm/cmake/modules/LLVMExternalProjectUtils.cmake
318–323

I think it's better to avoid setting these at all if disabled rather than setting them to an empty string. You can just append these to cmake_args.

@beanz @smeenai do you have any preferences/other ideas?

Hmm, what cache key does ccache use for compilers? Is it the --version output string, the path, or some combination? If the path is involved then I don't see any value in using ccache for configuring anything past stage1, since the compiler will (presumably) be installed somewhere else afterwards and not used directly from the build tree. If it's just --version, what are the cache pollution concerns? The compiler binaries output by all the stages should produce the same outputs given the same inputs (and presumably have the same --version), right?

Hmm, what cache key does ccache use for compilers? Is it the --version output string, the path, or some combination? If the path is involved then I don't see any value in using ccache for configuring anything past stage1, since the compiler will (presumably) be installed somewhere else afterwards and not used directly from the build tree. If it's just --version, what are the cache pollution concerns? The compiler binaries output by all the stages should produce the same outputs given the same inputs (and presumably have the same --version), right?

I am not sure about ccache. We use an internal distributed build cache system when we build LLVM. The compiler binary hash and the --version output will both be used to calculate the cache key so there is little value to use caching past stage1 as the runtime build and stage2+ will be built from a fresh from source compiler and it will always result in cache miss.

I don't believe ccache only uses --version output as cache key since two compiler binary with the same version string can have different output (e.g. 2 WIP source tree but track the same git revision).

So think it would be beneficial to add a flag to avoid using compiler launcher past stage1.

Hmm, what cache key does ccache use for compilers? Is it the --version output string, the path, or some combination? If the path is involved then I don't see any value in using ccache for configuring anything past stage1, since the compiler will (presumably) be installed somewhere else afterwards and not used directly from the build tree. If it's just --version, what are the cache pollution concerns? The compiler binaries output by all the stages should produce the same outputs given the same inputs (and presumably have the same --version), right?

I am not sure about ccache. We use an internal distributed build cache system when we build LLVM. The compiler binary hash and the --version output will both be used to calculate the cache key so there is little value to use caching past stage1 as the runtime build and stage2+ will be built from a fresh from source compiler and it will always result in cache miss.

I don't believe ccache only uses --version output as cache key since two compiler binary with the same version string can have different output (e.g. 2 WIP source tree but track the same git revision).

So think it would be beneficial to add a flag to avoid using compiler launcher past stage1.

Yeah, that sounds good. I also prefer @phosek's first option of letting users opt in to compiler launchers for boostrapping builds if they want, since I don't see a lot of value in doing that in general, but being conservative and having an option for variables to not pass through is likely the most expedient approach here, and that works for me.

I agree, in that case let's remove C_COMPILER_LAUNCHER;CXX_COMPILER_LAUNCHER from the list of default passthrough variables which seems like a reasonable default, and provide CLANG_BOOTSTRAP_EXTRA_PASSTHROUGH so developers have a way to pass these through to the next stage if they want to.

haowei added a comment.Mar 1 2023, 1:53 PM

I agree, in that case let's remove C_COMPILER_LAUNCHER;CXX_COMPILER_LAUNCHER from the list of default passthrough variables which seems like a reasonable default, and provide CLANG_BOOTSTRAP_EXTRA_PASSTHROUGH so developers have a way to pass these through to the next stage if they want to.

There is already a CLANG_BOOTSTRAP_PASSTHROUGH flag https://github.com/llvm/llvm-project/blob/dc1c8917afd3f2b306797890a56be66087feb832/clang/CMakeLists.txt#L749 that works in the way you describe the CLANG_BOOTSTRAP_EXTRA_PASSTHROUGH flag.

I think remove the compiler launcher from default pass through flags are fine. What about the using compiler launcher in runtime builds? I don't think we should read CLANG_BOOTSTRAP_PASSTHROUGH to determine if we should pass through compiler launcher flags.

haowei updated this revision to Diff 501670.Mar 1 2023, 2:40 PM
haowei marked an inline comment as done.

I think remove the compiler launcher from default pass through flags are fine. What about the using compiler launcher in runtime builds? I don't think we should read CLANG_BOOTSTRAP_PASSTHROUGH to determine if we should pass through compiler launcher flags.

I'd suggest a similar approach, that is remove C_COMPILER_LAUNCHER and CXX_COMPILER_LAUNCHER from the default passthrough list and provide a new LLVM_EXTERNAL_PROJECT_PASSTHROUGH variable.

I think remove the compiler launcher from default pass through flags are fine. What about the using compiler launcher in runtime builds? I don't think we should read CLANG_BOOTSTRAP_PASSTHROUGH to determine if we should pass through compiler launcher flags.

I'd suggest a similar approach, that is remove C_COMPILER_LAUNCHER and CXX_COMPILER_LAUNCHER from the default passthrough list and provide a new LLVM_EXTERNAL_PROJECT_PASSTHROUGH variable.

Will the PASSTHROUGH approach let us choose a different LAUNCHER for the external projects? For example, we could still distribute those compiles, albeit without expecting any caching.

phosek added a comment.Mar 3 2023, 1:40 AM

I think remove the compiler launcher from default pass through flags are fine. What about the using compiler launcher in runtime builds? I don't think we should read CLANG_BOOTSTRAP_PASSTHROUGH to determine if we should pass through compiler launcher flags.

I'd suggest a similar approach, that is remove C_COMPILER_LAUNCHER and CXX_COMPILER_LAUNCHER from the default passthrough list and provide a new LLVM_EXTERNAL_PROJECT_PASSTHROUGH variable.

Will the PASSTHROUGH approach let us choose a different LAUNCHER for the external projects? For example, we could still distribute those compiles, albeit without expecting any caching.

No, we would need a different option for that. I'd start with the passthrough approach which is generic and then look into the possibility of overriding the launcher for subprojects if there's a compelling use case.

haowei updated this revision to Diff 502841.Mar 6 2023, 3:50 PM
haowei retitled this revision from Add option to disable compiler launcher on external projects to Disable compiler launcher on external projects and multi stage clang.
haowei edited the summary of this revision. (Show Details)

What's the current status of this changeset?

phosek accepted this revision.Apr 24 2023, 11:17 AM

LGTM

This revision is now accepted and ready to land.Apr 24 2023, 11:17 AM
This revision was landed with ongoing or failed builds.Apr 24 2023, 12:16 PM
This revision was automatically updated to reflect the committed changes.