This is an archive of the discontinued LLVM Phabricator instance.

[Driver] Support Debian multiarch style lib/clang/14.0.0/x86_64-linux-gnu runtime path and include/x86_64-linux-gnu/c++/v1 libc++ path
AbandonedPublic

Authored by MaskRay on Sep 28 2021, 3:14 PM.

Details

Summary

This is needed when config.guess is removed (D109837) and
-DLLVM_ENABLE_PER_TARGET_RUNTIME_DIR=on is enabled.

LLVM_DEFAULT_TARGET_TRIPLE is recommended to be the Debian multiarch style
"x86_64-linux-gnu" (gcc -dumpmachine output). CMake will install runtime
libraries to lib/clang/14.0.0/x86_64-linux-gnu/, but clang --print-runtime-dir
detects lib/clang/14.0.0/x86_64-unknown-linux-gnu/ where
"x86_64-unknown-linux-gnu" is a normalized triple and fails to find the library.

To fix this, respect --target= (which defaults to LLVM_DEFAULT_TARGET_TRIPLE)
and a small set of GCC multilib style triple transformation (currently probably
just -m32 and -m64, otherwise *do no normalization*. I don't think we
support -mx32 or MIPS/RISC-V's fancy multilib combinations) to construct
the multiarch search path.

To this end, add UnnormalizedTriple to computeTargetTriple and add arch
transformation code. Then update getRuntimePath/getStdlibPath/addLibCxxIncludePaths
to use the unnormalized triple.

It would probably be nice to pass the unnormalized triple to getToolChain, but
currently it would result in numerous issues, because a unnormalized triple has
wrong OS/environment and can confuse driver code in many places.

Tested the following commands succeed.

-DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_ENABLE_PER_TARGET_RUNTIME_DIR=on -DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-unknown-linux-gnu
/tmp/out/custom0/bin/clang++ --stdlib=libc++ -fsanitize=address a.cc
/tmp/out/custom0/bin/clang++ --target=x86_64-linux-gnu --stdlib=libc++ -fsanitize=address a.cc  # this is a hack
/tmp/out/custom0/bin/clang++ --target=x86_64-unknown-linux-gnu --stdlib=libc++ -fsanitize=address a.cc

-DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_ENABLE_PER_TARGET_RUNTIME_DIR=on -DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-linux-gnu
/tmp/out/custom1/bin/clang++ --stdlib=libc++ -fsanitize=address a.cc
/tmp/out/custom1/bin/clang++ --target=x86_64-linux-gnu --stdlib=libc++ -fsanitize=address a.cc

For Debian, using --target=x86_linux-gnu with -DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-unknown-linux-gnu is a hack.
The preferred fix is to default LLVM_DEFAULT_TARGET_TRIPLE to x86_64-linux-gnu:
see https://discourse.llvm.org/t/rfc-fix-loose-behaviors-of-clang-target/60272

Diff Detail

Event Timeline

MaskRay created this revision.Sep 28 2021, 3:14 PM
MaskRay requested review of this revision.Sep 28 2021, 3:14 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 28 2021, 3:14 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript

just to make sure I understand, it will also update the install or "just" the dumpmachine option ?

MaskRay updated this revision to Diff 375722.Sep 28 2021, 3:25 PM

add missing file

MaskRay added a comment.EditedSep 28 2021, 3:28 PM

just to make sure I understand, it will also update the install or "just" the dumpmachine option ?

If LLVM_DEFAULT_TARGET_TRIPLE is x86_64-linux-gnu, clang -dumpmachine will still be the normalized (by Driver.cpp:computeTargetTriple) x86_64-unknown-linux-gnu.

This patch just allows clang to detect lib/clang/14.0.0/x86_64-linux-gnu which contains libclang_rt.builtins.a (used by `--rtlib=compiler-rt), and more common, asan/msan/etc runtime archives.

smeenai added inline comments.
clang/lib/Driver/ToolChain.cpp
500

This is effectively reverting the change to this function made in D101194 (and amended in rG36430d44edba), except it prefers the normalized triple over the unnormalized triple, right? Does getStdlibPath need to be changed as well?

This doesn't help the world-breaking D107799 on debian for me.
I don't know if this stands on it's own.

jrtc27 added a subscriber: jrtc27.Sep 28 2021, 5:16 PM

NB: summary has x86_64-known rather than x86_64-unknown in a couple of places

MaskRay updated this revision to Diff 375751.Sep 28 2021, 5:41 PM
MaskRay marked an inline comment as done.
MaskRay retitled this revision from [Driver] Support Debian multiarch style lib/clang/14.0.0/x86_64-linux-gnu runtime path to [Driver] Support Debian multiarch style lib/clang/14.0.0/x86_64-linux-gnu runtime path and include/x86_64-linux-gnu/c++/v1 libc++ path.
MaskRay edited the summary of this revision. (Show Details)

Fix libc++ include path as well
Prefer unnormalized over normalized (@smeenai)
Fix getStdlibPath as well

The reason I removed this behavior in D101194, aside from extra overhead introduced by the extra checks, is that we've seen cases where people would have both paths on their system which lead to difficult to diagnose issues where Clang would pick up a library different from the one they expected. I'd therefore prefer supporting only a single spelling of the triple that could be configured, and making sure that Clang build on Debian uses the right configuration.

MaskRay updated this revision to Diff 375753.Sep 28 2021, 5:51 PM

fix getRuntimePath
test libclang_rt.builtins.a

Prefer unnormalized over normalized (@smeenai)

To be clear, I wasn't advocating for one ordering over the other, just pointing out the similarity to the prior state of things :) @phosek would know what the right ordering is.

A potential solution would be to reintroduce the portion D101194 which would let each driver control the triple spelling (by overriding getMultiarchTriple. On Linux, including Debian, that logic is already implemented in https://github.com/llvm/llvm-project/blob/7255ce30e48feb07e4e82613f518683fbc247c1c/clang/lib/Driver/ToolChains/Linux.cpp#L40, and on all other platforms this could simply return TargetTriple.str() as was done in D101194.

MaskRay added a comment.EditedSep 28 2021, 5:59 PM

The reason I removed this behavior in D101194, aside from extra overhead introduced by the extra checks, is that we've seen cases where people would have both paths on their system which lead to difficult to diagnose issues where Clang would pick up a library different from the one they expected. I'd therefore prefer supporting only a single spelling of the triple that could be configured, and making sure that Clang build on Debian uses the right configuration.

The problem is that when LLVM_DEFAULT_TARGET_TRIPLE is "x86_64-linux-gnu" (no vendor part), CMake respects it (instead of adding unknown) and installs files to lib/clang/14.0.0/x86_64-linux-gnu/.
Currently clang driver only inspects the normalized lib/clang/14.0.0/x86_64-unknown-linux-gnu/ and cannot find it.

I think D101194 caused the workaround used by @sylvestre.ledru: symlink /usr/lib/llvm-14/lib/x86_64-linux-gnu -> x86_64-pc-linux-gnu
(I am quite confused by the pc part. That looks like a Debian configuration issue.)

A potential solution would be to reintroduce the portion D101194 which would let each driver control the triple spelling (by overriding getMultiarchTriple. On Linux, including Debian, that logic is already implemented in https://github.com/llvm/llvm-project/blob/7255ce30e48feb07e4e82613f518683fbc247c1c/clang/lib/Driver/ToolChains/Linux.cpp#L40, and on all other platforms this could simply return TargetTriple.str() as was done in D101194.

On many Debian/Ubuntu derivatives, Distro.IsDebian() || Distro.IsUbuntu() is false but they use the Debian multiarch hierarchy. How to make these distributions work?
For non-Debian distros (e.g. Arch Linux), I'd still prefer we use the GCC vanilla x86_64-unknown-linux-gnu.

The reason I removed this behavior in D101194, aside from extra overhead introduced by the extra checks, is that we've seen cases where people would have both paths on their system which lead to difficult to diagnose issues where Clang would pick up a library different from the one they expected. I'd therefore prefer supporting only a single spelling of the triple that could be configured, and making sure that Clang build on Debian uses the right configuration.

The problem is that when LLVM_DEFAULT_TARGET_TRIPLE is "x86_64-linux-gnu" (no vendor part), CMake respects it (instead of adding unknown) and installs files to lib/clang/14.0.0/x86_64-linux-gnu/.
Currently clang driver only inspects the normalized lib/clang/14.0.0/x86_64-unknown-linux-gnu/ and cannot find it.

I think D101194 caused the workaround used by @sylvestre.ledru: symlink /usr/lib/llvm-14/lib/x86_64-linux-gnu -> x86_64-pc-linux-gnu
(I am quite confused by the pc part. That looks like a Debian configuration issue.)

A potential solution would be to reintroduce the portion D101194 which would let each driver control the triple spelling (by overriding getMultiarchTriple. On Linux, including Debian, that logic is already implemented in https://github.com/llvm/llvm-project/blob/7255ce30e48feb07e4e82613f518683fbc247c1c/clang/lib/Driver/ToolChains/Linux.cpp#L40, and on all other platforms this could simply return TargetTriple.str() as was done in D101194.

On many Debian/Ubuntu derivatives, Distro.IsDebian() || Distro.IsUbuntu() is false but they use the Debian multiarch hierarchy. How to make these distributions work?
For non-Debian distros (e.g. Arch Linux), I'd still prefer we use the GCC vanilla x86_64-unknown-linux-gnu.

x86 is unusual; the default vendor, at least in the GNU world, is pc, not unknown (e.g. config.guess x86_64-linux-gnu prints x86_64-pc-linux-gnu).

MaskRay added a comment.EditedSep 28 2021, 6:07 PM

Let Debian/Ubuntu/... use LLVM_DEFAULT_TARGET_TRIPLE=*-linux-gnu (gcc -dumpmachine does not have the vendor part)
Non-Debian glibc distros use LLVM_DEFAULT_TARGET_TRIPLE=*-{unknown,pc}-linux-gnu. (gcc -dumpmachine has the vendor part). Non-Debian glibc distros using x86_64-linux-gnu is fine, that will just cost them some unneeded filesystem stats.

Currently, for libstdc++ and sysroot include and library paths, Clang Driver has to guess whether the Debian multiarch or vanilla GCC multiarch hierarchy is used.
The two hierarchies have some overlap in some edge cases, so the code is quite tricky.
We end up with lots of getVFS().exists and addPathIfExists.

By having Debian and non-Debian using different LLVM_DEFAULT_TARGET_TRIPLE, we can make better separation.
*-{unknown,pc,suse,etc}-linux-gnu users don't need to detect Debian multiarch paths at all.

The hard-coded Debian multiarch triples in Linux::getMultiarchTriple can be deleted, since we expect LLVM_DEFAULT_TARGET_TRIPLE to provide the correct value.
For cross compilation (targeting Debian) users, we expect them to specify Debian multiarch style --target=aarch64-linux-gnu (aarch64-pc-linux-gnu won't work).
If some multilib transformation (e.g. gnueabi -> gnueabihf) is useful, we can make that code generic, instead of relying on Debian multiarch triples.

The current getStdlibPath/getRuntimePath complexity (detecting two paths) is temporary.
In the future, we can try detecting only one.

x86 is unusual; the default vendor, at least in the GNU world, is pc, not unknown (e.g. config.guess x86_64-linux-gnu prints x86_64-pc-linux-gnu).

OK. llvm-project's old config.guess uses unknown.

% sh llvm/cmake/config.guess       
x86_64-unknown-linux-gnu
% sh /tmp/c/config.guess   # 'https://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.guess;hb=HEAD'
x86_64-pc-linux-gnu

The reason I removed this behavior in D101194, aside from extra overhead introduced by the extra checks, is that we've seen cases where people would have both paths on their system which lead to difficult to diagnose issues where Clang would pick up a library different from the one they expected. I'd therefore prefer supporting only a single spelling of the triple that could be configured, and making sure that Clang build on Debian uses the right configuration.

The problem is that when LLVM_DEFAULT_TARGET_TRIPLE is "x86_64-linux-gnu" (no vendor part), CMake respects it (instead of adding unknown) and installs files to lib/clang/14.0.0/x86_64-linux-gnu/.
Currently clang driver only inspects the normalized lib/clang/14.0.0/x86_64-unknown-linux-gnu/ and cannot find it.

I ran into this issue the past and the solution I considered would be to modify the CMake build of runtimes to query Clang for the triple (using -dumpmachine) and use that instead of LLVM_DEFAULT_TARGET_TRIPLE. I still think that change would be desirable for consistency and would be happy to implement it.

The reason I removed this behavior in D101194, aside from extra overhead introduced by the extra checks, is that we've seen cases where people would have both paths on their system which lead to difficult to diagnose issues where Clang would pick up a library different from the one they expected. I'd therefore prefer supporting only a single spelling of the triple that could be configured, and making sure that Clang build on Debian uses the right configuration.

The problem is that when LLVM_DEFAULT_TARGET_TRIPLE is "x86_64-linux-gnu" (no vendor part), CMake respects it (instead of adding unknown) and installs files to lib/clang/14.0.0/x86_64-linux-gnu/.
Currently clang driver only inspects the normalized lib/clang/14.0.0/x86_64-unknown-linux-gnu/ and cannot find it.

I ran into this issue the past and the solution I considered would be to modify the CMake build of runtimes to query Clang for the triple (using -dumpmachine) and use that instead of LLVM_DEFAULT_TARGET_TRIPLE. I still think that change would be desirable for consistency and would be happy to implement it.

If the host compiler is GCC (gcc -dumpmachine => x86_64-linux-gnu on Debian amd64), how to query Clang for the triple?

My previous comment has been edited. I believe by making Debian derivatives stick with *-linux-gnu, we can actually remove more code from the driver.

The reason I removed this behavior in D101194, aside from extra overhead introduced by the extra checks, is that we've seen cases where people would have both paths on their system which lead to difficult to diagnose issues where Clang would pick up a library different from the one they expected. I'd therefore prefer supporting only a single spelling of the triple that could be configured, and making sure that Clang build on Debian uses the right configuration.

The problem is that when LLVM_DEFAULT_TARGET_TRIPLE is "x86_64-linux-gnu" (no vendor part), CMake respects it (instead of adding unknown) and installs files to lib/clang/14.0.0/x86_64-linux-gnu/.
Currently clang driver only inspects the normalized lib/clang/14.0.0/x86_64-unknown-linux-gnu/ and cannot find it.

I ran into this issue the past and the solution I considered would be to modify the CMake build of runtimes to query Clang for the triple (using -dumpmachine) and use that instead of LLVM_DEFAULT_TARGET_TRIPLE. I still think that change would be desirable for consistency and would be happy to implement it.

If the host compiler is GCC (gcc -dumpmachine => x86_64-linux-gnu on Debian amd64), how to query Clang for the triple?

You mean when you build runtimes with the host compiler and then use them with Clang (i.e. by using -DLLVM_ENABLE_PROJECTS="clang;lld;libcx")? This has been discussed several times on llvm-dev in the past, ideally nobody should be using this mode because there's no guarantee that the host compiler is ABI compatible with Clang. Unfortunately, I believe that some projects still build that way and it might take a while to migrate them to another approach.

My previous comment has been edited. I believe by making Debian derivatives stick with *-linux-gnu, we can actually remove more code from the driver.

Can you elaborate on it? What solution do you have in mind?

You mean when you build runtimes with the host compiler and then use them with Clang (i.e. by using -DLLVM_ENABLE_PROJECTS="clang;lld;libcx")? This has been discussed several times on llvm-dev in the past, ideally nobody should be using this mode because there's no guarantee that the host compiler is ABI compatible with Clang. Unfortunately, I believe that some projects still build that way and it might take a while to migrate them to another approach.

I think many groups will object to disallowing LLVM_ENABLE_PROJECTS for libcxx/libcxxabi/libunwind.
They can not be unsupported anytime soon.
For example, one step of a msan build uses -DLLVM_ENABLE_PROJECTS='libcxx;libcxxabi'.

My previous comment has been edited. I believe by making Debian derivatives stick with *-linux-gnu, we can actually remove more code from the driver.

Can you elaborate on it? What solution do you have in mind?

See updated https://reviews.llvm.org/D110663#3029138

MaskRay added a comment.EditedSep 28 2021, 7:48 PM

On top on this Diff, with the following patch, ninja check-clang-driver passes. We may consider dropping the normalized triple code in the future.

--- i/clang/lib/Driver/ToolChain.cpp
+++ w/clang/lib/Driver/ToolChain.cpp
@@ -494,6 +494,7 @@ std::string ToolChain::getRuntimePath() const {
   // "x86_64-linux-gnu" (no vendor part).
   SmallString<128> P;
   llvm::sys::path::append(P, D.ResourceDir, "lib", D.getTargetTriple());
+  return std::string(P);
   if (getVFS().exists(P))
     return std::string(P);
 
@@ -506,6 +507,7 @@ std::string ToolChain::getStdlibPath() const {
   SmallString<128> P;
   // First try the unnormalized triple a la getRuntimePath().
   llvm::sys::path::append(P, D.Dir, "..", "lib", D.getTargetTriple());
+  return std::string(P);
   if (getVFS().exists(P))
     return std::string(P);
MaskRay updated this revision to Diff 375778.Sep 28 2021, 9:15 PM
MaskRay edited the summary of this revision. (Show Details)

Support Debian -m32

This is going to break Fuchsia as implemented. We assume that {x86_64,aarch64}-unknown-fuchsia and {x86_64,aarch64}-fuchsia behave identically and developers are allowed to use whichever spelling they prefer. With this change that's no longer true. This may break other platforms that make similar assumptions.

MaskRay added a comment.EditedOct 4 2021, 5:54 PM

This is going to break Fuchsia as implemented. We assume that {x86_64,aarch64}-unknown-fuchsia and {x86_64,aarch64}-fuchsia behave identically and developers are allowed to use whichever spelling they prefer. With this change that's no longer true. This may break other platforms that make similar assumptions.

Are Fuchsia developers specifying --target= explicitly? I think that's the direction we don't want to support.
I can special case Fuchsia to use the normalized triple, but that's code that I want to avoid.

From my understanding of D110900, if a Linux distro prefers *-linux to *-linux-gnu, we should drop --target= normalization to make this available.


I think clang driver does normalization so that OS and environment checks can work , otherwise the constructor Triple::Triple can be fooled to use unknown.
This does not mean that we should use normalized paths for include and library paths that users don't explicitly specify.

Some ideas that normalization may not be great. Different platforms may prefer different normalization styles. For example, llvm-project's old config.guess may prefer the unknown vendor while newer config.guess prefers the pc vendor. redhat may prefer *-linux instead of *-linux-gnu. If we need to do normalization for include and library paths, we need to encode many rules. If we avoid normalization we can save much code and be less surprising.

This is going to break Fuchsia as implemented. We assume that {x86_64,aarch64}-unknown-fuchsia and {x86_64,aarch64}-fuchsia behave identically and developers are allowed to use whichever spelling they prefer. With this change that's no longer true. This may break other platforms that make similar assumptions.

Are Fuchsia developers specifying --target= explicitly? I think that's the direction we don't want to support.

Within Fuchsia source tree, we're trying use normalized triple (that is ${arch}-unknown-fuchsia) but I'm aware of other projects that target Fuchsia and use different spellings. For example, Rust uses ${arch}-fuchsia, but they also use ${arch}-unknown-linux-gnu (rather than ${arch}-linux-gnu).

I can special case Fuchsia to use the normalized triple, but that's code that I want to avoid.

I don't think this issue is specific to just Fuchsia. Rather it seems to me like the problem is the inconsistent handling of triples on Linux and I'd like to avoid introducing workarounds for other platforms.

From my understanding of D110900, if a Linux distro prefers *-linux to *-linux-gnu, we should drop --target= normalization to make this available.


I think clang driver does normalization so that OS and environment checks can work , otherwise the constructor Triple::Triple can be fooled to use unknown.
This does not mean that we should use normalized paths for include and library paths that users don't explicitly specify.

Some ideas that normalization may not be great. Different platforms may prefer different normalization styles. For example, llvm-project's old config.guess may prefer the unknown vendor while newer config.guess prefers the pc vendor. redhat may prefer *-linux instead of *-linux-gnu. If we need to do normalization for include and library paths, we need to encode many rules. If we avoid normalization we can save much code and be less surprising.

That's why I'd prefer the idea I implemented in D101194 where we let each driver handle the normalization rather than trying to come with a single universal way that apparently doesn't exist.

We would also provide a default that would be llvm::Triple::normalize so only the platforms that want to deviate from that would need to implement a custom logic.

jdenny added a subscriber: jdenny.Oct 6 2021, 8:43 AM
MaskRay added a comment.EditedOct 6 2021, 9:27 AM

This is going to break Fuchsia as implemented. We assume that {x86_64,aarch64}-unknown-fuchsia and {x86_64,aarch64}-fuchsia behave identically and developers are allowed to use whichever spelling they prefer. With this change that's no longer true. This may break other platforms that make similar assumptions.

Are Fuchsia developers specifying --target= explicitly? I think that's the direction we don't want to support.

Within Fuchsia source tree, we're trying use normalized triple (that is ${arch}-unknown-fuchsia) but I'm aware of other projects that target Fuchsia and use different spellings. For example, Rust uses ${arch}-fuchsia, but they also use ${arch}-unknown-linux-gnu (rather than ${arch}-linux-gnu).

I can special case Fuchsia to use the normalized triple, but that's code that I want to avoid.

I don't think this issue is specific to just Fuchsia. Rather it seems to me like the problem is the inconsistent handling of triples on Linux and I'd like to avoid introducing workarounds for other platforms.

From my understanding of D110900, if a Linux distro prefers *-linux to *-linux-gnu, we should drop --target= normalization to make this available.


I think clang driver does normalization so that OS and environment checks can work , otherwise the constructor Triple::Triple can be fooled to use unknown.
This does not mean that we should use normalized paths for include and library paths that users don't explicitly specify.

Some ideas that normalization may not be great. Different platforms may prefer different normalization styles. For example, llvm-project's old config.guess may prefer the unknown vendor while newer config.guess prefers the pc vendor. redhat may prefer *-linux instead of *-linux-gnu. If we need to do normalization for include and library paths, we need to encode many rules. If we avoid normalization we can save much code and be less surprising.

That's why I'd prefer the idea I implemented in D101194 where we let each driver handle the normalization rather than trying to come with a single universal way that apparently doesn't exist.

We would also provide a default that would be llvm::Triple::normalize so only the platforms that want to deviate from that would need to implement a custom logic.

Now I re-read D101194, it did make simplification to code like ToolChain::getRuntimePath, but I may not call it moving into the desired state.

I think the best approach is to avoid normalization for include/library path purpose, then we don't even need getMultiarchTriple overloads at all and avoid all the platform customization.
This is necessary to prevent that platforms add more conversion like (D111207, which I think is moving toward the wrong direction).
The correct LLVM_DEFAULT_TARGET_TRIPLE compensates for dropping the customization. (We might need some conversion logic in a transition period.)

For example, Rust uses ${arch}-fuchsia, but they also use ${arch}-unknown-linux-gnu (rather than ${arch}-linux-gnu).

Such loosen usage is partly due to the traditional fuzzy behavior of clang driver Generic_GCC::GCCInstallationDetector::init but I am not sure we want to encourage it / support it in the future.
People keep adding more customization to override the previous (IMHO wrong) behavior.
Asking them to be strict and sticking with one triple is the right direction, can greatly simplify the code in clang driver, and make porting to new platforms (since they don't even need to add customization to clang driver at all!) easier.
For Rust's case, ${arch}-unknown-linux-gnu does not look good since newer config.guess prefers ${arch}-pc-linux-gnu now.

We have accrued sufficient triple and include/library path problems, and patches over patches just make the logic more complex and more difficult to solve.
I think we should focus on the desired state, not hold off changes just because "downstream is relying on some traditional behavior".
We strive for making transition as smooth as possible (e.g. we still have ${arch}-unknown-linux-gnu <=> ${arch}-linux-gnu conversion logic), but the Rust's target confusion can be handed over to them.

MaskRay updated this revision to Diff 378040.Oct 7 2021, 4:29 PM

Hard-code Fuchsia to use normalized triple

fuchsia.c uses --target=x86_64-fuchsia but expects to find x86_64-unknown-fuchsia path.

Linux should respect --target= and do no normalization.


Debian/Ubuntu are currently in an unfortunate case, and I think the patch is needed to fix it.

MaskRay added a comment.EditedOct 11 2021, 10:49 AM

Ping:)

Our runtime build story is unfortunate and we should fix it before making LLVM_ENABLE_PROJECTS='...;libcxx;libunwind' more dead.

I'm still worried that this solution as implemented is going to break existing users, including major one like Rust.

I'm not also not a fan of special casing individual targets like Fuchsia directly inside Driver, I believe that any per-target logic should live in the corresponding ToolChain subclass.

I'm still worried that this solution as implemented is going to break existing users, including major one like Rust.

I'm not also not a fan of special casing individual targets like Fuchsia directly inside Driver, I believe that any per-target logic should live in the corresponding ToolChain subclass.

Given the potential wide ranging impact of this change, it might be also worth starting a thread on llvm-dev to reach a broader audience.

I'm still worried that this solution as implemented is going to break existing users, including major one like Rust.

Well, it's better than calling the Project build deprecated (in favor of Runtime build) and making the situation unfortunate for major distributions like Debian/Ubuntu.

For downstream packages like Rust, there is currently no evidence for the breakage.

Moreover, llvm-project has always had the attitude: "downstream projects are on their own".
In reality we seldom exercise this right but if there are quirky things we reserve the right to ask them to fix.
(I have patched packages like Julia/Rust/ldc, so I understand that for each major version, some adaptation is unavoidable.)

I'm not also not a fan of special casing individual targets like Fuchsia directly inside Driver, I believe that any per-target logic should live in the corresponding ToolChain subclass.

I add the code more as a workaround. I hope Fuchsia can move away from the mix of x86_64-unknown-fuchsia and x86_64-fuchsia.
At the point when the mix is fixed, we can simply remove the special case.
The special case is for providing a transition period. Using subclasses adds more complexity to the driver code.

I am not sure why you are pinging this review often?! Maybe chat with people who gave you feedback before directly?!

Myself I wasted so much time because of D107799 that I am very reluctant to enable LLVM_ENABLE_PER_TARGET_RUNTIME_DIR again.
The mismatch of the triple / quadruple between the system triple and what have been decided in llvm ( x86_64-linux-gnu vs x86_64-pc-linux-gnu)

Moreover, llvm-project has always had the attitude: "downstream projects are on their own".

I am not sure it is a correct statement. llvm project is a bunch of people with different focus.
But in general, we are trying to be nice with the ecosystem and people who rely on this software.

MaskRay added a comment.EditedNov 23 2021, 10:48 AM

I am not sure why you are pinging this review often?! Maybe chat with people who gave you feedback before directly?!

https://llvm.org/docs/CodeReview.html "If it is not urgent, the common courtesy ping rate is one week." I did it less frequently than once per week.

Ping @phosek

Myself I wasted so much time because of D107799 that I am very reluctant to enable LLVM_ENABLE_PER_TARGET_RUNTIME_DIR again.
The mismatch of the triple / quadruple between the system triple and what have been decided in llvm ( x86_64-linux-gnu vs x86_64-pc-linux-gnu)

Moreover, llvm-project has always had the attitude: "downstream projects are on their own".

I am not sure it is a correct statement. llvm project is a bunch of people with different focus.
But in general, we are trying to be nice with the ecosystem and people who rely on this software.

Still correct. For courtesy I typically notify projects but many changes (especially IR and codegen changes) cannot really notify users simply because you cannot enumerate every user and
in some IR/codegen folks's minds this would unnecessarily slow down development.
For this case I have notified rust, which may be the most likely impacted user after we remove config.guess.
For this patch along, I think really not many will observe anything but Debian/Ubuntu and their derivatives will get improved. The mostly likely impacted user was probably just Fuchsia, but I have added a hack to exclude Fuchsia.

Were you able to actually reproduce the problem that lead to revert of D107799, or is this based on blind guesses?

I am not sure why you are pinging this review often?! Maybe chat with people who gave you feedback before directly?!

https://llvm.org/docs/CodeReview.html "If it is not urgent, the common courtesy ping rate is one week." I did it less frequently than once per week.

Ping @phosek

Myself I wasted so much time because of D107799 that I am very reluctant to enable LLVM_ENABLE_PER_TARGET_RUNTIME_DIR again.
The mismatch of the triple / quadruple between the system triple and what have been decided in llvm ( x86_64-linux-gnu vs x86_64-pc-linux-gnu)

Moreover, llvm-project has always had the attitude: "downstream projects are on their own".

I am not sure it is a correct statement. llvm project is a bunch of people with different focus.
But in general, we are trying to be nice with the ecosystem and people who rely on this software.

Still correct. For courtesy I typically notify projects but many changes (especially IR and codegen changes) cannot really notify users simply because you cannot enumerate every user and
in some IR/codegen folks's minds this would unnecessarily slow down development.
For this case I have notified rust, which may be the most likely impacted user. I think the mostly likely impacted user was probably just Fuchsia, but I have added a hack to exclude Fuchsia.

MaskRay added a comment.EditedNov 23 2021, 11:04 AM

Were you able to actually reproduce the problem that lead to revert of D107799, or is this based on blind guesses?

The patch will fix the "lib/clang/14.0.0/x86_64-unknown-linux-gnu/" issue on Debian/Ubuntu and their derivatives:
if Debian multiarch LLVM_DEFAULT_TARGET_TRIPLE=x86_64-linux-gnu is used, the -unknown should not be present.
I have noticed the issue on my Debian testing machine.

Whether this helps the revert of D107799 (-DLLVM_ENABLE_PER_TARGET_RUNTIME_DIR=on): blind guess.

I was previously of the impression that -DLLVM_ENABLE_PER_TARGET_RUNTIME_DIR=on was critical to phosek/ldionne's "prefer runtimes build movement". But it turns out that it isn't, so I am not going to too actively pursue restoring D107799.

phosek added a subscriber: hvdijk.Jan 25 2022, 3:26 PM

Ping @phosek

I apologize about the belated response, but I wanted to test this change first given the potentially large blast radius.

To clarify, when I talk Fuchsia toolchain, I don't just mean toolchain targeting Fuchsia, but a Clang toolchain distribution our team manages that's also used by a number of other teams such Dart, Flutter, Pigweed to build code for variety of platforms including Linux.

In my testing, I found that these users aren't currently using a single uniform target triple spelling, I've seen both ${arch}-linux-gnu and ${arch}-unknown-linux-gnu used pretty liberally. So if we were to land this change today, it would break many of these users. We could clean up and unify all of these uses, but it's going to take some effort because it spans lots of projects. I'm also worried that there might be other Clang users in a similar situation who are not included on this change.

That's why I think there should be an RFC sent to cfe-dev, to both reach an agreement on whether this is a direction everyone agrees with, but also to raise the awareness. There's also a question whether this should be rolled out in a more gradual fashion (for example by starting with a warning first).

Furthermore, I'm also not sure about the motivation behind this change. The change summary mentions D109837, but @hvdijk said that this is not a blocker. Is there another reason for doing this?

clang/lib/Driver/Driver.cpp
494

I find the name UnnormalizedTriple a bit confusing as it may imply that there's some "un-normalization" process involved. I'd prefer using a TargetTriple to refer to "triple given by the user (via the --target= option)" and NormalizedTriple as "normalized triple".

1267–1270

I really think this logic belongs to clang::driver::toolchains::Fuchsia. Moving it there is going to require some changes to how toolchains are instantiated, and should thus be done as a separate change, but doing so could also help us cleanup other special case logic like that exists in Driver like https://github.com/llvm/llvm-project/blob/8ba9c794feb30cd969b9776c39873def10c51bff/clang/lib/Driver/Driver.cpp#L576.

MaskRay added a comment.EditedJan 25 2022, 4:27 PM

No worries! Thanks for the reply.

I use Debian and want it to work well and support the -DLLVM_ENABLE_PER_TARGET_RUNTIME_DIR=on direction.
Currently, -DLLVM_DEFAULT_TARGET_TRIPLE=aarch64-linux-gnu (Debian style multiarch triple, so no -unknown or -pc) does not work for libc++ and some runtime libraries.
This led to the revert of D107799 which I think really unfortunate because runtime builds encourage LLVM_ENABLE_PER_TARGET_RUNTIME_DIR, which de facto deviate from the regular builds and llvm-project has to support two hierarchies.


Here is -DLLVM_DEFAULT_TARGET_TRIPLE=aarch64-unknown-linux-gnu:

cmake -GNinja -Hllvm -B/tmp/out/custom1 -DCMAKE_BUILD_TYPE=Release -DCMAKE_CROSSCOMPILING=on -DCMAKE_INSTALL_PREFIX=/tmp/opt/aarch64 -DLLVM_TARGETS_TO_BUILD=AArch64 -DLLVM_DEFAULT_TARGET_TRIPLE=aarch64-unknown-linux-gnu -DLLVM_TARGET_ARCH=AArch64 -DLLVM_ENABLE_PROJECTS='clang;lld' -DLLVM_ENABLE_RUNTIMES='compiler-rt;l
ibcxx;libcxxabi;libunwind'
ninja -C /tmp/out/custom1 lld cxx cxxabi unwind builtins

/tmp/out/custom1/bin/clang++ -c -stdlib=libc++ a.cc; /tmp/out/custom1/bin/clang++ -fuse-ld=lld --dyld-prefix=/usr/aarch64-linux-gnu --unwindlib=libunwind --rtlib=compiler-rt -nostdlib++ -pthread -static-libgcc a.o -Wl,--push-state,-Bstatic,-lc++,-lc++abi,--pop-state,-rpath=/usr/aarch64-linux-gnu/lib -ldl -o a  # works
./a  # runs if you have binfmt_misc and qemu-aarch64-static

If I change the CMake line to use -DLLVM_DEFAULT_TARGET_TRIPLE=aarch64-linux-gnu (to match Debian aarch64-linux-gnu-g++), libc++ compiles do not work:

% /tmp/out/custom1/bin/clang++ -fuse-ld=lld --dyld-prefix=/usr/aarch64-linux-gnu -Wl,-rpath=/usr/aarch64-linux-gnu/lib a.cc -o a
% ./a  # works

% /tmp/out/custom1/bin/clang++ -c -stdlib=libc++ a.cc
In file included from a.cc:1:
In file included from /tmp/out/custom1/bin/../include/c++/v1/stdio.h:101:
/tmp/out/custom1/bin/../include/c++/v1/__config:13:10: fatal error: '__config_site' file not found
#include <__config_site>
         ^~~~~~~~~~~~~~~
1 error generated.

The issue is: /tmp/out/custom1/include/aarch64-linux-gnu/c++/v1/ is not in the search path.

With this patch, the command will succeed. Here are the search paths for includes and libraries:

% /tmp/out/custom1/bin/clang++ -fuse-ld=lld -stdlib=libc++ -v a.cc |& sed -E 's/ "?-[iIL]/\n&/g'
clang version 14.0.0
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /tmp/out/custom1/bin
Found candidate GCC installation: /usr/lib/gcc-cross/aarch64-linux-gnu/11
Selected GCC installation: /usr/lib/gcc-cross/aarch64-linux-gnu/11
Candidate multilib: .;@m64
Selected multilib: .;@m64
 "/tmp/out/custom1/bin/clang-14" -cc1 -triple aarch64-unknown-linux-gnu -emit-obj -mrelax-all --mrelax-relocations -disable-free -clear-ast-before-backend -disable-llvm-verifier -discard-value-names -main-file-name a.cc -mrelocation-model static -mframe-pointer=non-leaf -fmath-errno -ffp-contract=on -fno-rounding-math -mconstructor-aliases -funwind-tables=2 -target-cpu generic -target-feature +neon -target-feature +v8a -target-abi aapcs -fallow-half-arguments-and-returns -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -v -fcoverage-compilation-dir=/tmp/c -resource-dir /tmp/out/custom1/lib/clang/14.0.0
 -internal-isystem /tmp/out/custom1/bin/../include/aarch64-linux-gnu/c++/v1
 -internal-isystem /tmp/out/custom1/bin/../include/c++/v1
 -internal-isystem /tmp/out/custom1/lib/clang/14.0.0/include
 -internal-isystem /usr/local/include
 -internal-isystem /usr/lib/gcc-cross/aarch64-linux-gnu/11/../../../../aarch64-linux-gnu/include
 -internal-externc-isystem /include
 -internal-externc-isystem /usr/include -fdeprecated-macro -fdebug-compilation-dir=/tmp/c -ferror-limit 19 -fno-signed-char -fgnuc-version=4.2.1 -fcxx-exceptions -fexceptions -target-feature +outline-atomics -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -o /tmp/a-ffa089.o -x c++ a.cc
clang -cc1 version 14.0.0 based upon LLVM 14.0.0git default target aarch64-linux-gnu
ignoring nonexistent directory "/include"
#include "..." search starts here:
#include <...> search starts here:
 /tmp/out/custom1/bin/../include/aarch64-linux-gnu/c++/v1
 /tmp/out/custom1/bin/../include/c++/v1
 /tmp/out/custom1/lib/clang/14.0.0/include
 /usr/local/include
 /usr/lib/gcc-cross/aarch64-linux-gnu/11/../../../../aarch64-linux-gnu/include
 /usr/include
End of search list.
 "/tmp/out/custom1/bin/ld.lld" -EL --eh-frame-hdr -m aarch64linux -dynamic-linker /lib/ld-linux-aarch64.so.1 -o a.out /usr/lib/gcc-cross/aarch64-linux-gnu/11/../../../../aarch64-linux-gnu/lib/crt1.o /usr/lib/gcc-cross/aarch64-linux-gnu/11/../../../../aarch64-linux-gnu/lib/crti.o /usr/lib/gcc-cross/aarch64-linux-gnu/11/crtbegin.o
 -L/tmp/out/custom1/bin/../lib/aarch64-linux-gnu
 -L/usr/lib/gcc-cross/aarch64-linux-gnu/11
 -L/usr/lib/gcc-cross/aarch64-linux-gnu/11/../../../../lib64
 -L/lib/aarch64-linux-gnu
 -L/lib/../lib64
 -L/usr/lib/aarch64-linux-gnu
 -L/usr/lib/../lib64
 -L/usr/lib/gcc-cross/aarch64-linux-gnu/11/../../../../aarch64-linux-gnu/lib
 -L/tmp/out/custom1/bin/../lib
 -L/lib
 -L/usr/lib /tmp/a-ffa089.o -lc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc /usr/lib/gcc-cross/aarch64-linux-gnu/11/crtend.o /usr/lib/gcc-cross/aarch64-linux-gnu/11/../../../../aarch64-linux-gnu/lib/crtn.o

In my testing, I found that these users aren't currently using a single uniform target triple spelling, I've seen both ${arch}-linux-gnu and ${arch}-unknown-linux-gnu used pretty liberally. So if we were to land this change today, it would break many of these users. We could clean up and unify all of these uses, but it's going to take some effort because it spans lots of projects. I'm also worried that there might be other Clang users in a similar situation who are not included on this change.

This is not a problem. Both -DLLVM_DEFAULT_TARGET_TRIPLE=aarch64-linux-gnu and -DLLVM_DEFAULT_TARGET_TRIPLE=aarch64-unknown-linux-gnu will work.

# /tmp/out/custom-unknown is built with -DLLVM_DEFAULT_TARGET_TRIPLE=aarch64-unknown-linux-gnu
% /tmp/out/custom-unknown/bin/clang++ -fuse-ld=lld a.cc # libstdc++ works
% /tmp/out/custom-unknown/bin/clang++ -fuse-ld=lld -stdlib=libc++ a.cc # libc++ works
MaskRay added inline comments.Jan 25 2022, 4:57 PM
clang/lib/Driver/Driver.cpp
494

llvm::Triple Target(llvm::Triple::normalize(TargetTriple)); refers to the normalized triple is references in many places. Having a TargetTriple may cause confusion.

I also want to avoid renaming Target (referenced in too many places) to minimize the diff.

1267–1270

I agree. The change seems involved so I do not do it here...

Moving toolchains::MinGW::fixTripleArch(D, Target, Args); to the toolchain-specific file will indeed be nice.

Could you please stop with the pings? and chat directly ? (It seems that you are colleagues?!)

I would like to follow what is happening here but not get flooded.

Could you please stop with the pings? and chat directly ? (It seems that you are colleagues?!)

I would like to follow what is happening here but not get flooded.

(I don't mind if you/others review this as well, and honestly I don't know why it needs to take such a long time to proceed.)
It seems that other reviewers just hand off the review burden to phosek, but they can review this by themselves as well.

This is more a Linux distro thing, less related to Fuchsia. So technically you will be a good one, especially that the whole multiarch thing is related to Debian....

Generally the guidance is one ping every week. The last ping has just 4 days, but if take the full history into account my ping frequency is much less frequent than once every week.

I'm personally ignoring this review because quite honestly i was horrified of just how nonchalantly dismissive/accusative the reaction was to the problem i raised during the revert.

No worries! Thanks for the reply.

I use Debian and want it to work well and support the -DLLVM_ENABLE_PER_TARGET_RUNTIME_DIR=on direction.
Currently, -DLLVM_DEFAULT_TARGET_TRIPLE=aarch64-linux-gnu (Debian style multiarch triple, so no -unknown or -pc) does not work for libc++ and some runtime libraries.

I believe this can be also addressed by changes to the CMake build without needing to change the driver. I have a series of patches and I'll try to send them for review in the next few days (I think those patches are desirable independently of which solution we ultimately choose).


Here is -DLLVM_DEFAULT_TARGET_TRIPLE=aarch64-unknown-linux-gnu:

cmake -GNinja -Hllvm -B/tmp/out/custom1 -DCMAKE_BUILD_TYPE=Release -DCMAKE_CROSSCOMPILING=on -DCMAKE_INSTALL_PREFIX=/tmp/opt/aarch64 -DLLVM_TARGETS_TO_BUILD=AArch64 -DLLVM_DEFAULT_TARGET_TRIPLE=aarch64-unknown-linux-gnu -DLLVM_TARGET_ARCH=AArch64 -DLLVM_ENABLE_PROJECTS='clang;lld' -DLLVM_ENABLE_RUNTIMES='compiler-rt;l
ibcxx;libcxxabi;libunwind'
ninja -C /tmp/out/custom1 lld cxx cxxabi unwind builtins

/tmp/out/custom1/bin/clang++ -c -stdlib=libc++ a.cc; /tmp/out/custom1/bin/clang++ -fuse-ld=lld --dyld-prefix=/usr/aarch64-linux-gnu --unwindlib=libunwind --rtlib=compiler-rt -nostdlib++ -pthread -static-libgcc a.o -Wl,--push-state,-Bstatic,-lc++,-lc++abi,--pop-state,-rpath=/usr/aarch64-linux-gnu/lib -ldl -o a  # works
./a  # runs if you have binfmt_misc and qemu-aarch64-static

If I change the CMake line to use -DLLVM_DEFAULT_TARGET_TRIPLE=aarch64-linux-gnu (to match Debian aarch64-linux-gnu-g++), libc++ compiles do not work:

% /tmp/out/custom1/bin/clang++ -fuse-ld=lld --dyld-prefix=/usr/aarch64-linux-gnu -Wl,-rpath=/usr/aarch64-linux-gnu/lib a.cc -o a
% ./a  # works

% /tmp/out/custom1/bin/clang++ -c -stdlib=libc++ a.cc
In file included from a.cc:1:
In file included from /tmp/out/custom1/bin/../include/c++/v1/stdio.h:101:
/tmp/out/custom1/bin/../include/c++/v1/__config:13:10: fatal error: '__config_site' file not found
#include <__config_site>
         ^~~~~~~~~~~~~~~
1 error generated.

The issue is: /tmp/out/custom1/include/aarch64-linux-gnu/c++/v1/ is not in the search path.

With this patch, the command will succeed. Here are the search paths for includes and libraries:

% /tmp/out/custom1/bin/clang++ -fuse-ld=lld -stdlib=libc++ -v a.cc |& sed -E 's/ "?-[iIL]/\n&/g'
clang version 14.0.0
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /tmp/out/custom1/bin
Found candidate GCC installation: /usr/lib/gcc-cross/aarch64-linux-gnu/11
Selected GCC installation: /usr/lib/gcc-cross/aarch64-linux-gnu/11
Candidate multilib: .;@m64
Selected multilib: .;@m64
 "/tmp/out/custom1/bin/clang-14" -cc1 -triple aarch64-unknown-linux-gnu -emit-obj -mrelax-all --mrelax-relocations -disable-free -clear-ast-before-backend -disable-llvm-verifier -discard-value-names -main-file-name a.cc -mrelocation-model static -mframe-pointer=non-leaf -fmath-errno -ffp-contract=on -fno-rounding-math -mconstructor-aliases -funwind-tables=2 -target-cpu generic -target-feature +neon -target-feature +v8a -target-abi aapcs -fallow-half-arguments-and-returns -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -v -fcoverage-compilation-dir=/tmp/c -resource-dir /tmp/out/custom1/lib/clang/14.0.0
 -internal-isystem /tmp/out/custom1/bin/../include/aarch64-linux-gnu/c++/v1
 -internal-isystem /tmp/out/custom1/bin/../include/c++/v1
 -internal-isystem /tmp/out/custom1/lib/clang/14.0.0/include
 -internal-isystem /usr/local/include
 -internal-isystem /usr/lib/gcc-cross/aarch64-linux-gnu/11/../../../../aarch64-linux-gnu/include
 -internal-externc-isystem /include
 -internal-externc-isystem /usr/include -fdeprecated-macro -fdebug-compilation-dir=/tmp/c -ferror-limit 19 -fno-signed-char -fgnuc-version=4.2.1 -fcxx-exceptions -fexceptions -target-feature +outline-atomics -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -o /tmp/a-ffa089.o -x c++ a.cc
clang -cc1 version 14.0.0 based upon LLVM 14.0.0git default target aarch64-linux-gnu
ignoring nonexistent directory "/include"
#include "..." search starts here:
#include <...> search starts here:
 /tmp/out/custom1/bin/../include/aarch64-linux-gnu/c++/v1
 /tmp/out/custom1/bin/../include/c++/v1
 /tmp/out/custom1/lib/clang/14.0.0/include
 /usr/local/include
 /usr/lib/gcc-cross/aarch64-linux-gnu/11/../../../../aarch64-linux-gnu/include
 /usr/include
End of search list.
 "/tmp/out/custom1/bin/ld.lld" -EL --eh-frame-hdr -m aarch64linux -dynamic-linker /lib/ld-linux-aarch64.so.1 -o a.out /usr/lib/gcc-cross/aarch64-linux-gnu/11/../../../../aarch64-linux-gnu/lib/crt1.o /usr/lib/gcc-cross/aarch64-linux-gnu/11/../../../../aarch64-linux-gnu/lib/crti.o /usr/lib/gcc-cross/aarch64-linux-gnu/11/crtbegin.o
 -L/tmp/out/custom1/bin/../lib/aarch64-linux-gnu
 -L/usr/lib/gcc-cross/aarch64-linux-gnu/11
 -L/usr/lib/gcc-cross/aarch64-linux-gnu/11/../../../../lib64
 -L/lib/aarch64-linux-gnu
 -L/lib/../lib64
 -L/usr/lib/aarch64-linux-gnu
 -L/usr/lib/../lib64
 -L/usr/lib/gcc-cross/aarch64-linux-gnu/11/../../../../aarch64-linux-gnu/lib
 -L/tmp/out/custom1/bin/../lib
 -L/lib
 -L/usr/lib /tmp/a-ffa089.o -lc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc /usr/lib/gcc-cross/aarch64-linux-gnu/11/crtend.o /usr/lib/gcc-cross/aarch64-linux-gnu/11/../../../../aarch64-linux-gnu/lib/crtn.o

Thanks for the reproducer, that's helpful.

In my testing, I found that these users aren't currently using a single uniform target triple spelling, I've seen both ${arch}-linux-gnu and ${arch}-unknown-linux-gnu used pretty liberally. So if we were to land this change today, it would break many of these users. We could clean up and unify all of these uses, but it's going to take some effort because it spans lots of projects. I'm also worried that there might be other Clang users in a similar situation who are not included on this change.

This is not a problem. Both -DLLVM_DEFAULT_TARGET_TRIPLE=aarch64-linux-gnu and -DLLVM_DEFAULT_TARGET_TRIPLE=aarch64-unknown-linux-gnu will work.

# /tmp/out/custom-unknown is built with -DLLVM_DEFAULT_TARGET_TRIPLE=aarch64-unknown-linux-gnu
% /tmp/out/custom-unknown/bin/clang++ -fuse-ld=lld a.cc # libstdc++ works
% /tmp/out/custom-unknown/bin/clang++ -fuse-ld=lld -stdlib=libc++ a.cc # libc++ works

The issue is not in how we build but how we use the toolchain.

Say we provide our toolchain to projects A and B. Project A uses clang --target=aarch64-linux-gnu .... Project B uses clang --target=aarch64-unknown-linux-gnu .... Today, this works fine because we internally normalize the target to aarch64-unknown-linux-gnu and use it find headers and libraries. With this change, depending on how we build the toolchain, either project A or B is going to break.

This is the situation we're in right now, except the number of projects is significantly higher and so is the number of different targets and target triple spellings.

MaskRay updated this revision to Diff 409171.EditedFeb 16 2022, 1:19 AM
MaskRay edited the summary of this revision. (Show Details)

Detect both unnormalized and normalized paths for compatibility. hack when default -DLLVM_DEFAULT_TARGET_TRIPLE is x86_64-unknown-linux-gnu on Debian.

-DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_ENABLE_PER_TARGET_RUNTIME_DIR=on -DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-unknown-linux-gnu
/tmp/out/custom0/bin/clang++ --stdlib=libc++ -fsanitize=address a.cc
/tmp/out/custom0/bin/clang++ --target=x86_64-linux-gnu --stdlib=libc++ -fsanitize=address a.cc

-DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_ENABLE_PER_TARGET_RUNTIME_DIR=on -DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-linux-gnu
/tmp/out/custom1/bin/clang++ --stdlib=libc++ -fsanitize=address a.cc
/tmp/out/custom1/bin/clang++ --target=x86_64-linux-gnu --stdlib=libc++ -fsanitize=address a.cc

Say we provide our toolchain to projects A and B. Project A uses clang --target=aarch64-linux-gnu .... Project B uses clang --target=aarch64-unknown-linux-gnu .... Today, this works fine because we internally normalize the target to aarch64-unknown-linux-gnu and use it find headers and libraries. With this change, depending on how we build the toolchain, either project A or B is going to break.

The new diff makes this work, though in the long term (a) -DLLVM_DEFAULT_TARGET_TRIPLE should use x86_64-linux-gnu on Debian and (b) mismatching --target= and -DLLVM_DEFAULT_TARGET_TRIPLE should be unsupported: https://discourse.llvm.org/t/rfc-fix-loose-behaviors-of-clang-target/60272

MaskRay added a comment.EditedFeb 16 2022, 1:32 AM

No worries! Thanks for the reply.

I use Debian and want it to work well and support the -DLLVM_ENABLE_PER_TARGET_RUNTIME_DIR=on direction.
Currently, -DLLVM_DEFAULT_TARGET_TRIPLE=aarch64-linux-gnu (Debian style multiarch triple, so no -unknown or -pc) does not work for libc++ and some runtime libraries.

I believe this can be also addressed by changes to the CMake build without needing to change the driver. I have a series of patches and I'll try to send them for review in the next few days (I think those patches are desirable independently of which solution we ultimately choose).

It could be addressed by letting CMake normalize LLVM_DEFAULT_TARGET_TRIPLE before constructing the multiarch search path, but I think that's the wrong direction.
The only way allowing removal of various normalization code in CMake/ClangDriver is to do no normalization. Just respect user-specified target triples (from LLVM_DEFAULT_TARGET_TRIPLE, or from --target=) as is.

Users keep adding more triples to Generic_GCC::GCCInstallationDetector::CollectLibDirsAndTriples lists. They are workarounds that LLVM_DEFAULT_TARGET_TRIPLE is not correct (does not reflect the system multiarch library path). If we ensure LLVM_DEFAULT_TARGET_TRIPLE reflects the truth, we can remove nearly all hacks from Generic_GCC::GCCInstallationDetector::CollectLibDirsAndTriples

MaskRay edited the summary of this revision. (Show Details)Feb 16 2022, 11:56 AM
MaskRay abandoned this revision.Jul 14 2022, 10:08 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 14 2022, 10:08 PM