Page MenuHomePhabricator

[Driver][Distro] Fix ArchLinux sysroot detection
AbandonedPublic

Authored by 10ne1 on Sep 22 2022, 10:05 AM.

Details

Summary

While cross-compiling Linux kernel tools with Clang on ArchLinux, it
was revealed the sysroot is not properly detected and instead of fixing
the root cause in Clang, kernel developers started doing workarounds [1]
upon workarounds [2] in the kernel build to be able to pass a working
sysroot during Clang invocation.

To give a concrete example, ArchLinux installs its cross-compliation
sysroot in /usr/aarch64-linux-gnu for triplet aarch64-linux-gnu.
The sysroot is not properly detected in Linux::computeSysRoot() due
to the complicated code combining the GCC check with the mips test.

Fixing this bug allows to significantly clean up the kernel build as well
as to fix the cross-compilation detection on ArchLinux, as well as to
simplify the clang sysroot function.

[1] https://github.com/torvalds/linux/commit/cebdb7374577ac6e14afb11311af8c2c44a259fa
[2] https://github.com/torvalds/linux/commit/7fd9fd46a459272e641be78c1cc36baab1921fa1

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
Herald added a project: Restricted Project. · View Herald TranscriptSep 22 2022, 10:05 AM
10ne1 requested review of this revision.Sep 22 2022, 10:05 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 22 2022, 10:05 AM

Thanks for the patch. Can you please post a full diff (git diff -U9999).

Adding @MaskRay as a reviewer.

10ne1 updated this revision to Diff 462423.Sep 23 2022, 2:06 AM

Regenerated diff with git diff HEAD~1 -U999999

Is it worth contacting the package maintainer for LLVM+clang for Arch-Linux in regards to this patch?

clang/lib/Driver/Distro.cpp
210 ↗(On Diff #462423)

typo

213 ↗(On Diff #462423)

Twine has an intentionally non-explicit constructor that accepts a StringRef, which also has an intentionally non-explicit constructor that accepts a std::string, which is what Triple::normalize() returns.

Let's be consistent in the explicit construction of these temporary objects by removing the explicit call to the Twine constructor.

Also, why is it necessary to normalize the Triple? Can you give an example of the "bad" input and how this "fixes" it?

clang/lib/Driver/ToolChains/Linux.cpp
362–379

Do we need the Arch-linux test before the call to GCCInstallation.isValid()? Otherwise it seems like this duplicates a fair amount of code computing the Path. The initial parts of the Path seem to match; there's only more to it if the Distro is not arch-linux.

10ne1 added a comment.EditedSep 26 2022, 1:44 AM

Is it worth contacting the package maintainer for LLVM+clang for Arch-Linux in regards to this patch?

Yes, thanks for the suggestion, I will open an issue on the Arch bug tracker for clang so the maintainer is aware and he can also be part of the discussion here.

I also plan to back-port this fix to Arch's clang pkg, once it gets into a suitable form in LLVM so having the Arch PKGBUILD fixed (Arch is a rolling release, no stable backports required) would allow us to also clean up the Linux kernel tools tree.

10ne1 added inline comments.Sep 26 2022, 2:46 AM
clang/lib/Driver/Distro.cpp
213 ↗(On Diff #462423)

I do not claim to fully understand the LLVM toolchain & triplet auto-detection code, so maybe this normalization is more of a workaround than a real fix. Maybe we need to do the normalization earlier? I do not know, any suggestions are welcome.

The behavior I'm seeing is:

If TargetOrHost triplet is "aarch64-linux-gnu" then TargetOrHost.isOSLinux() == false and GetDistro returns Distro::UnknownDistro which causes failures like the following when building the kernel tools:

make ARCH=arm64 CC=clang CROSS_COMPILE=aarch64-linux-gnu- bpf
  DESCEND bpf

Auto-detecting system features:
...                        libbfd: [ OFF ]
...        disassembler-four-args: [ OFF ]


  DESCEND bpftool

Auto-detecting system features:
...                        libbfd: [ on  ]
...        disassembler-four-args: [ on  ]
...                          zlib: [ OFF ]
...                        libcap: [ OFF ]
...               clang-bpf-co-re: [ on  ]


make[2]: *** No rule to make target '/home/adi/workspace/cola/GOO0021/chromiumos/src/third_party/kernel/v5.15/tools/include/linux/math.h', needed by 'btf.o'.  Stop.
make[1]: *** [Makefile:110: bpftool] Error 2
make: *** [Makefile:69: bpf] Error 2

If we do the triple normalization step before detecting the distro, the triplet becomes aarch64-unknown-linux-gnu which results in TargetOrHost.isOSLinux() == true, the distro is properly detected, then the system features are ON and the build works.

10ne1 added inline comments.Sep 26 2022, 2:57 AM
clang/lib/Driver/ToolChains/Linux.cpp
362–379

In my testing on ArchLinux, GCCInstallation.isValid() is always true.

The problem is that if test condition doesn't just test for a valid GCC install, it also tests getTriple().isMIPS() which is always false on Arch Linux which does not support mips, so the sysroot will always be an empty string.

If you wish I can create a separate commit / review to untangle if (!GCCInstallation.isValid() || !getTriple().isMIPS()) and try to reduce the code duplication, because really having a valid GCC install is independent from running on mips. What do you think?

nickdesaulniers requested changes to this revision.Sep 26 2022, 2:51 PM

It's not clear to me why Arch is special here. The path may be different, but normalizing the triple seems unnecessary. Blocking this until I'm confident that we're getting to the root cause of the issue here.

clang/lib/Driver/Distro.cpp
213 ↗(On Diff #462423)

If TargetOrHost triplet is "aarch64-linux-gnu" then TargetOrHost.isOSLinux() == false

What?! That sounds like a bug. What does Triple::getOS() return in that case?

Otherwise it sounds like GetDistro is getting called to early, before whatever sets the OS has been initialized correctly.

clang/lib/Driver/ToolChains/Linux.cpp
362–379

That doesn't make sense.

If GCCInstallation.isValid() is always true then we should not be returning the empty string.

This revision now requires changes to proceed.Sep 26 2022, 2:51 PM
MaskRay added a comment.EditedSep 26 2022, 3:50 PM

I am nervous as well as Arch Linux has many derivatives. They likely don't use ID=arch in /etc/os-release. The patch won't work for them.
In general, I don't think our current approach adding new Distro::* flavors scale or really meet the needs of numerous less-popular distributions.

The last few comments of https://discourse.llvm.org/t/rfc-adding-a-default-file-location-to-config-file-support/63606 discuss a generic mechanism solving the distribution difference problem with configuration files.
Also, the new driver option --gcc-install-dir (https://discourse.llvm.org/t/add-gcc-install-dir-deprecate-gcc-toolchain-and-remove-gcc-install-prefix/65091 ; milestone: 16.0.0) can be useful for some tasks.

10ne1 added inline comments.Sep 27 2022, 2:00 AM
clang/lib/Driver/Distro.cpp
213 ↗(On Diff #462423)

I've done some further debugging and turns out there was an error in my test environment due to the Linux kernel tools/ cleanup patch not being correctly applied.

After fixing that I can confirm the values for the triple are correct and this normalization is not required anymore:

llvm::dbgs() << "TargetOrHost.getOS() = " << TargetOrHost.getOS() << "\n";
llvm::dbgs() << "TargetOrHost.isOSLinux() = " << TargetOrHost.isOSLinux() << "\n";

produces:

TargetOrHost.isOSLinux() = 1
TargetOrHost.getOS() = 9

which is what we expect.

I will drop this specific triplet code and update the diff, the only remaining open issue is the sysroot detection change below. Thanks for your patience!

clang/lib/Driver/ToolChains/Linux.cpp
362–379

It does makes sense if you read the condition carefully:

if (!GCCInstallation.isValid() || !getTriple().isMIPS())

results in:

if ( ! true || ! false )

which means an empty string is always returned because Arch does not support mips even though a valid GCC install is present.

I think I'll just go ahead and clean up this code and update the diff to drop the triple normalization.

10ne1 added a comment.Sep 27 2022, 2:12 AM

I am nervous as well as Arch Linux has many derivatives. They likely don't use ID=arch in /etc/os-release. The patch won't work for them.
In general, I don't think our current approach adding new Distro::* flavors scale or really meet the needs of numerous less-popular distributions.

The last few comments of https://discourse.llvm.org/t/rfc-adding-a-default-file-location-to-config-file-support/63606 discuss a generic mechanism solving the distribution difference problem with configuration files.
Also, the new driver option --gcc-install-dir (https://discourse.llvm.org/t/add-gcc-install-dir-deprecate-gcc-toolchain-and-remove-gcc-install-prefix/65091 ; milestone: 16.0.0) can be useful for some tasks.

@MaskRay I agree the current /etc/os-release and Distro::* based OS detection is not scalable and will require adding more and more per-distro exceptions, which is ugly.

However what other option do we currently have other than waiting for the generic distro mechanism to be implemented and continue doing workarounds in projects depending on LLVM/Clang like the kernel?

My suggestion is to add a Distro::* exception to fix Arch Linux now which will also allow to clean up the kernel code and afterwards, when the generic mechanism is finally implemented, this along with all the other Distro:* ugliness can be dropped. Basically I try to avoid blocking for an indefinite amount of time for a proper solution, which when it will finally come, will drop this code anyway.

What do you think? If you say we should wait, that is fine with me as well.

10ne1 updated this revision to Diff 463177.Sep 27 2022, 3:37 AM
10ne1 retitled this revision from [Driver][Distro] Fix ArchLinux triplet and sysroot detection to [Driver][Distro] Fix ArchLinux sysroot detection.
10ne1 edited the summary of this revision. (Show Details)
10ne1 marked 2 inline comments as done.Sep 27 2022, 3:42 AM

@nickdesaulniers @MaskRay I've updated the diff & summary based on the review comments which I've marked as done. Please tell me what you think. Thank you!

10ne1 updated this revision to Diff 463189.Sep 27 2022, 5:00 AM

Fixed some clang-format problems.

clang/lib/Driver/ToolChains/Linux.cpp
357–358

hoist

357–358

duplication with below...

358–359

Cool, now this can be hoisted above if (getTriple().isCSKY()) { and not checked again within that condition.

360–361

I see that getVFS() is called already without going through getDriver() in this method. Can we do so as well here?

361–383

...seems like some of this is duplicated with CSKY above. Can you please refactor the MIPS and CSKY checks to reuse more code?

I doubt arch-linux has a CSKY port; I suspect you might be able to check for arch-linux first, then do the CSKY and MIPS special casing after.

All this code is doing is figuring out a Path, then if it exists then return it. Feel like is should be:

if android:
  path = ...
elif arch:
  path = ...
elif csky:
  path = ...
elif mips:
  path = ...
else:
  return ""

if path.exists():
  return path
return ""
362–379

Ah, ok. Yeah your refactoring is making this clearer. I think we can refactor this method a bit more (in addition to fixing this Arch-Linux specific issue).

10ne1 marked 5 inline comments as done.Sep 28 2022, 2:15 AM

FYI: @MaskRay I think you will be very happy that after the simplifications Nick suggested the Distro::* additions are not necessary anymore.

clang/lib/Driver/ToolChains/Linux.cpp
361–383

Thanks for this suggestion. Indeed after doing all the simplifications I have some great news: we do not need to test if Distro == ArchLinux because the sysroot path it uses is the implicit base path one common to all architectures!

So just doing the following is enough to also fix ArchLinux:

std::string Path = (InstallDir + "/../../../../" + TripleStr).str();

if (getTriple().isCSKY())
  Path = Path + "/libc";

if (getTriple().isMIPS()) {
  ...
 }

if (getVFS().exists(Path))
  return Path;

return std::string();
10ne1 updated this revision to Diff 463465.Sep 28 2022, 2:17 AM
10ne1 marked an inline comment as done.
10ne1 edited the summary of this revision. (Show Details)
clang/lib/Driver/ToolChains/Linux.cpp
361–383

Yes, this is much nicer in that we don't have to special case arch-linux. Sometimes refactorings beget more refactorings; it's nice when that happens.

363

With all of the string concatenation going on, I wonder if Path should be a Twine? llvm::vfs::Filesystem::exists accepts a const Twine&. That avoids multiple reallocations and copies, and does one lazily.

(Every time I've tried to use Twine, I wind up with either -Wdangling-gsl or segfaults though! Still, please give it a shot.)

369–375

CSKY and MIPS should be mutually exclusive. Please make this second if an else if.

if csky:
  ...
elif mips:
  ...
370

+=

373

Specifically there's 2 known variants for mips, it looks like. Please update this comment.

376

It might be worthwhile to have the variable be

std::string OSSuffix = GCCInstallation.getMultilib().osSuffix();

rather than the Multilib object.

377

+=

clang/lib/Driver/ToolChains/Linux.cpp
376

err... I guess a std::string&

10ne1 marked 7 inline comments as done.Sep 29 2022, 6:16 AM
10ne1 added inline comments.
clang/lib/Driver/ToolChains/Linux.cpp
363

I tried making Twine work but I got many errors, couldn't even make it build. Looks like twine doesn't have += operator for strings, and I also got many error: use of deleted function ‘llvm::Twine& llvm::Twine::operator=(const llvm::Twine&)’. C++ is such a mystery language.

Please let's just leave it a string for now, it's enough cleanups. :) Thanks!

10ne1 updated this revision to Diff 463864.Sep 29 2022, 6:17 AM
10ne1 marked an inline comment as done.

Updated based on feedback from Nick.

nickdesaulniers accepted this revision.Sep 29 2022, 10:00 AM

Looks great! Nice job @10ne1 . Need me to commit this for you?

This revision is now accepted and ready to land.Sep 29 2022, 10:00 AM

Looks great! Nice job @10ne1 . Need me to commit this for you?

Yes please, thank you!

Since this decreases complexity, it's probably fine. But we really need some tests in clang/test/Driver see linux-cross.cpp

MaskRay requested changes to this revision.EditedSep 29 2022, 11:12 AM

I'll grab an Arch Linux machine for testing, but I don't think this is currently in a form for submitting.
This adds new functionality for non-MIPS and we need some fake file hierarchies (like those used in linux-cross.cpp).
I'll add the test, though, and submit this for you.

Request changes for now.

This revision now requires changes to proceed.Sep 29 2022, 11:12 AM
MaskRay added inline comments.Sep 29 2022, 11:14 AM
clang/lib/Driver/ToolChains/Linux.cpp
369

If the else branch uses braces, the then branch needs brances as well.
https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements

I can fix this.

MaskRay added a comment.EditedSep 29 2022, 11:18 AM

Another opinion is whether we actually need the more and more complex sysroot computation logic.
Inferring sysroot from GCCInstallation looks backward to me. We should get sysroot first, then compute GCCInstallation, not the order as implemented in this patch.

Yes, I see mips (https://reviews.llvm.org/rG08450bd55ccdc4aee4f5f73cde97e25b3c4ce5b9 2013), Android (D45291 in 2018) and csky (D121445) special cases. To be fair, if I saw them earlier, I'd raise my concerns as well.


https://reviews.llvm.org/D134337 (default configuration file) will soon land and we can move to require distributions to provide a config file instead.

I'll grab an Arch Linux machine for testing, but I don't think this is currently in a form for submitting.
This adds new functionality for non-MIPS and we need some fake file hierarchies (like those used in linux-cross.cpp).
I'll add the test, though, and submit this for you.

Request changes for now.

Ok. Thanks, please ping if you need any action on my side.

I'll grab an Arch Linux machine for testing, but I don't think this is currently in a form for submitting.
This adds new functionality for non-MIPS and we need some fake file hierarchies (like those used in linux-cross.cpp).
I'll add the test, though, and submit this for you.

Request changes for now.

Ok. Thanks, please ping if you need any action on my side.

My latest thought is that this patch is going toward a wrong direction: https://reviews.llvm.org/D134454#3824630

10ne1 added a comment.Sep 30 2022, 6:03 AM

I'll grab an Arch Linux machine for testing, but I don't think this is currently in a form for submitting.
This adds new functionality for non-MIPS and we need some fake file hierarchies (like those used in linux-cross.cpp).
I'll add the test, though, and submit this for you.

Request changes for now.

Ok. Thanks, please ping if you need any action on my side.

My latest thought is that this patch is going toward a wrong direction: https://reviews.llvm.org/D134454#3824630

To verify I understand correctly what you are suggesting: should we close this review and wait for distros like Arch to specify the sysroots via the new config files mechanism? (Arch is currently at LLVM 14.0.6, not sure when they will upgrade).

This will also block the kernel cleanups, but if @nickdesaulniers also agrees to postpone that kernel work, then ok, let's wait.

Another opinion is whether we actually need the more and more complex sysroot computation logic.
Inferring sysroot from GCCInstallation looks backward to me. We should get sysroot first, then compute GCCInstallation, not the order as implemented in this patch.

Yes, I see mips (https://reviews.llvm.org/rG08450bd55ccdc4aee4f5f73cde97e25b3c4ce5b9 2013), Android (D45291 in 2018) and csky (D121445) special cases. To be fair, if I saw them earlier, I'd raise my concerns as well.


https://reviews.llvm.org/D134337 (default configuration file) will soon land and we can move to require distributions to provide a config file instead.

I disagree with the assessment that this patch makes sysroot detection more complex; it is a simplification and bug fix. Detecting the sysroot first, then GCCInstallation is a higher risk change than this patch.

The default configuration file is interesting, but it's not clear what even arch-linux would put in their configuration file; this patch cleans up what looks like a bug to me introduced in https://reviews.llvm.org/rG08450bd55ccdc4aee4f5f73cde97e25b3c4ce5b9 that folks have been working around downstream for quite some time.


I think @10ne1 just needs to fix up the style nit, and a test might be nice.

clang/test/Driver/linux-cross.cpp already has a test using a "fake" Arch-Linux tree in clang/test/Driver/Inputs/archlinux_i686_tree/. A new "fake" tree could be added to emulate the current way arch lays out these sysroots.

clang/lib/Driver/ToolChains/Linux.cpp
375

clang/lib/Driver/ToolChains/Linux.cpp Linux::Linux has:

if (IsCSKY)
  SysRoot = SysRoot + SelectedMultilib.osSuffix();

lol. So we should either do this for BOTH mips and csky here, or there.

While cross-compiling Linux kernel tools with Clang on ArchLinux, it was revealed the sysroot is not properly detected and instead of fixing the root cause in Clang, kernel developers started doing workarounds [1] upon workarounds [2] in the kernel build to be able to pass a working sysroot during Clang invocation.

I disagree that [1] and [2] are workarounds. Figuring out the sysroot and GCC installation directory is the canonical and bullet-proof way to let Clang reuse the glibc/GCC files.
It works for system cross GCC packages as well as user-built cross GCC packages, regardless how sysroot is configured.
Other approaches require file hierarchy detection which is more or less brittle. Some is unavoidable as we want Clang to be a drop-in replacement of GCC.
Unfortunately traditionally we did not set a clear bar and many extended system packages are supported, e.g. the strange mips distribution detection, gentoo-config, RedHat devtools, etc.
Adding such extended support just doesn't scale. Thankfully the main branch has now a fullblown configuration file support (probably with some rough edges) and they can be easily
extended to support a new distribution's bespoke GCC customization.

For the Arch Linux example, what if another distribution claims that they use /usr/aarch64-linux-gnu/sysroot and suit them better?
What if it is another directory?

Clang Driver detects a GCC installation in the sysroot, then it is backward to change the sysroot after a GCC installation is found.
Worse, computeSysroot result does not completely replace reading the stored Sysroot variable.
Some places (e.g. -isysroot, --print-search-dirs) use the Sysroot variable.
I am inclined to think that they are correct and what we actually need to fix are mips/Android/CSKY in Linux::computeSysRoot.

10ne1 added a comment.Oct 3 2022, 2:20 AM

@MaskRay and @nickdesaulniers Can you please work together to reach a consensus on what is the best path forward? I am ok either way, just need to know what the next steps are. :) Thank you.

MaskRay added a comment.EditedOct 3 2022, 4:06 PM
  • This patch should be abandoned. Inferring sysroot from gcc-toolchain is backward.
  • The D644 (Support Sourcery CodeBench MIPS toolchain) computeSysRoot should not be used by new support.
  • --gcc-toolchain= is discouraged.
  • New cross compilation support should specify --target= with --sysroot= and possibly --gcc-install-dir=.
  • If a distro wants to just use --target=, provide a configuration file. Gentoo folks are working on supporting gcc-config without adding complexity to Clang Driver: https://github.com/llvm/llvm-project/issues/57570 . Other distributions should follow.

The following fragment in kernel tools/scripts/Makefile.include looks right to me. I don't think it is a workaround. In the future --gcc-toolchain= should be replaced with --gcc-install-dir=

else ifneq ($(CROSS_COMPILE),)
# Allow userspace to override CLANG_CROSS_FLAGS to specify their own
# sysroots and flags or to avoid the GCC call in pure Clang builds.
ifeq ($(CLANG_CROSS_FLAGS),)
CLANG_CROSS_FLAGS := --target=$(notdir $(CROSS_COMPILE:%-=%))
GCC_TOOLCHAIN_DIR := $(dir $(shell which $(CROSS_COMPILE)gcc 2>/dev/null))
ifneq ($(GCC_TOOLCHAIN_DIR),)
CLANG_CROSS_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)$(notdir $(CROSS_COMPILE))
CLANG_CROSS_FLAGS += --sysroot=$(shell $(CROSS_COMPILE)gcc -print-sysroot)
CLANG_CROSS_FLAGS += --gcc-toolchain=$(realpath $(GCC_TOOLCHAIN_DIR)/..)
endif # GCC_TOOLCHAIN_DIR
endif # CLANG_CROSS_FLAGS
CFLAGS += $(CLANG_CROSS_FLAGS)
AFLAGS += $(CLANG_CROSS_FLAGS)
endif # CROSS_COMPILE
10ne1 abandoned this revision.Oct 6 2022, 1:24 AM

It's fine for CSKY to use config file. I only have 2 points.

  1. I agree sysroot should be separated from GCC because sysroot is not dependent to GCC and there is even not gcc when we use llvm runtime. This rule should also apply to multilib logic. Sysroot can detect multilib path individually.
  2. From user view, it's not easy to add sysroot or config file path manually, so it needs good way to enable this functionality.

It's fine for CSKY to use config file. I only have 2 points.

Thanks. It'll be nice for CSKY to move away from changing computeSysRoot.

  1. I agree sysroot should be separated from GCC because sysroot is not dependent to GCC and there is even not gcc when we use llvm runtime. This rule should also apply to multilib logic. Sysroot can detect multilib path individually.
  2. From user view, it's not easy to add sysroot or config file path manually, so it needs good way to enable this functionality.

There is a good summary on the recent configuration file improvements: https://blogs.gentoo.org/mgorny/2022/10/07/clang-in-gentoo-now-sets-default-runtimes-via-config-file/
The proper way is to let the system GCC or a similar package to provide a Clang configuration file.