This is an archive of the discontinued LLVM Phabricator instance.

[Driver] Perform Linux distribution detection just once
ClosedPublic

Authored by dmantipov on Sep 5 2020, 4:45 AM.

Details

Summary

When running regular ('clang++ foo.cc') native compilation on Linux target in a typical configuration, an attempt to detect distribution type is performed 4 times at least:

  • during an attempt to detect CUDA installation in clang::driver::CudaInstallationDetector::CudaInstallationDetector();
  • during an instantiation of Linux-specific toolchain driver (clang::driver::toolchains::Linux::Linux());
  • when special handling of '-faddrsig' is performed in clang::driver::tools::Clang::ConstructJob();
  • finally when constructing a linker job in clang::driver::toolchains::Linux::getDynamicLinker().

Since distribution detection involves VFS, filesystem I/O and even parsing of file(s) contents, it looks desirable to cache the result and perform the whole procedure just once.

Diff Detail

Event Timeline

dmantipov created this revision.Sep 5 2020, 4:45 AM
dmantipov requested review of this revision.Sep 5 2020, 4:45 AM
dmantipov updated this revision to Diff 290281.Sep 7 2020, 6:55 AM

Add trivial {/etc,/usr/lib}/os-release parser and fix tests.

I guess it isn't possible to write tests for this?

Does it have an impact on cross compile ?

Also, please fix the clang-format warning!

clang/lib/Driver/Distro.cpp
229

you could probably do "return Distro::Un" here
(same line 230)

aganea added a subscriber: rnk.Sep 17 2020, 5:54 AM

Please install & run git clang-format before uploading the patch.

clang/include/clang/Driver/Distro.h
27

You can leave out the -1 since this isn't serialized.

clang/lib/Driver/Distro.cpp
31–48

Early exit & move L23 to L32.

52–53

Early exit to avoid indent and ease reading:

if (!File)
  return Distro::UnknownDistro;

Also move L50 to L57.

207

I'm not sure if this is safe. @rnk Do we expect to call GetDistro with a different VFS or a different Triple when compiling with Clang? For GPU targets? Shouldn't we do this higher up, on the call site?

rnk added inline comments.Sep 17 2020, 10:46 AM
clang/lib/Driver/Distro.cpp
207

I think there is a plausible use case for a compile server that does two compiles with different VFSs where you would get different distro detection results. I think this should be a field on the Driver object instead of a global variable.

I know LLD uses lots of globals, but that's not part of the clang design philosophy.

rnk added inline comments.Sep 17 2020, 10:52 AM
clang/lib/Driver/Distro.cpp
207

Sorry, I spoke too soon.

For @aganea's concern, my understanding is that the distro is a property of the host machine doing the compilation. Even if we do two compiles in the same process on the same machine with two triples, they should have the same distro, if they are using the same real FS. At least, the FS shouldn't change out from under us in a way that changes the result of distro detection. :) I think it's reasonable to use distro detection to inform header search paths, but we should avoid using it to control properties of the output, like stack protection security features, for example. Otherwise, it becomes impossible to target one Linux distro from another, because the distro does not appear to be controllable by a command line flag.

And, now that I see what is happening with the real FS checks here, I withdraw my concern. Even if you did make this a property of the driver, I would want some global variable somewhere to check that we don't end up doing the distro check twice. It's the only way to really be sure that some passerby doesn't introduce a new distro check. =P

Updated (style adjustments).

dmantipov marked 3 inline comments as done.Sep 18 2020, 12:19 AM

What's next with this? Should I add more reviewers?

rnk added a comment.Sep 23 2020, 11:11 AM

I think we can go forward with the reviewers we have. I have one more concern. Are the other reviewers happy?

clang/lib/Driver/Distro.cpp
207

I guess I have one more concern: this code isn't thread safe. It's unlikely to every be called from multiple threads, but let's use the safe code pattern anyway. You can use lazy static local initialization like so:

static ... GetDistro(...) {
  if (FS is Real FS && on Linux ...) {
    static const Distro::DistroType hostDistro = computeRealDistro();
    return hostDistro;
  }
  return computeDistroWithVFS();
}

IIUC compiler driver is not intended to be multithreaded. But OK, here is the version with llvm::call_once(...).

IIUC compiler driver is not intended to be multithreaded. But OK, here is the version with llvm::call_once(...).

Thanks! There could be at least two cases I see, where the driver could be multi-threaded:

  1. once clang-cl /MP -fintegrated-cc1 is supported (D52193), I'm still planning to work on that, although in that case, it wouldn't reach this distro detection. However that patch could be extended to support clang ... -j.
  2. Further, if the idea of llvm-buildozer ever lands (D86351) which in essence is a generalization of the above.

There's a slight issue with your last update, it now fails the tests (because several distros are checked one after another in the same process). You can perhaps leverage EXPECT_EXIT(..., ::testing::ExitedWithCode(0)) in the distro tests to run each of them in a separate process (see llvm/unittests/Support/ErrorTest.cpp for an example).
Also please run git clang-format.

rnk accepted this revision.Sep 24 2020, 10:43 AM

llvm::call_once is fine, but IMO the static local version is preferable: it's built in to the language, no headers to include.

clang/lib/Driver/Distro.cpp
224

I think this would work and be simpler than call_once:

static Distro::DistroType LinuxDistro = DetectDistro(llvm::vfs::getRealFileSystem());

I could be wrong.

This revision is now accepted and ready to land.Sep 24 2020, 10:43 AM
dmantipov updated this revision to Diff 294302.Sep 25 2020, 7:31 AM
dmantipov marked an inline comment as done.

Well, the problem with tests seems to be a bit wider - tests uses llvm::vfs::InMemoryFileSystem, which is not "real", so detection will return Distro::UnknownDistro anyway. So I think we can just add fallback branch for such pseudo filesystems and live with current tests as is.

aganea accepted this revision.Sep 25 2020, 9:25 AM

Makes sense, you cache the commonly taken path, but not the (very uncommon) VFS codepath.

LGTM, with some minor comments. Thanks again!

clang/include/clang/Driver/Distro.h
119

Please leave out code that doesn't participate in the patch. You can always commit NFC patches afterward with just clang-format changes if you wish. It's fine though if you move code around.

clang/lib/Driver/Distro.cpp
225

You can leave out the once_flag/call_once and simply apply Reid's suggestion, since they are equivalent (static initialization is thread safe [1]):

if (onRealFS) {
  static Distro::DistroType LinuxDistro = DetectDistro(VFS);
  return LinuxDistro;
}

[1] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3797.pdf, section 6.7.4 "If control enters the declaration concurrently while the variable is being initialized, the concurrent execution shall wait for completion of the initialization"

This revision was automatically updated to reflect the committed changes.