This is an archive of the discontinued LLVM Phabricator instance.

[docs] Added solutions to slow build under common problems
ClosedPublic

Authored by e-leclercq on Mar 1 2020, 10:17 AM.

Details

Summary

I added a list of options to configure should someone have issues with long build times or running out of memory. This was added under common problems in the getting started section of the documentation.

Event Timeline

e-leclercq created this revision.Mar 1 2020, 10:17 AM
Herald added a project: Restricted Project.Mar 1 2020, 10:17 AM

I think this also should talk about make vs ninja generator, and using at least ld.gold or even lld.

llvm/docs/GettingStarted.rst
1104

Should mention that this is ninja specific.
And I think "similar to the -j option used with make." is missing the words "but for link jobs only".

Issues with compile time and memory requirements come up regularly, so we thought it would be a good idea to document mitigations. For instance:

https://lists.llvm.org/pipermail/llvm-dev/2020-January/137883.html

https://lists.llvm.org/pipermail/llvm-dev/2019-June/133506.html

http://lists.llvm.org/pipermail/llvm-dev/2019-November/137226.html

https://reviews.llvm.org/D72402

Documenting this has been suggested in those threads several times. I think a separate section just covering this problem would be a good idea. Other parts in the manual (such as https://llvm.org/docs/CMake.html#options-and-variables) are references, and may not be where someone would look when trying to compile LLVM for the first time.

Thanks for this! I was thinking about adding this as well because we get failed build emails to the list all the time.

We could recommend using -DLLVM_ENABLE_LLD=ON or gold if they don't have lld. Also -DBUILD_SHARED_LIBS=ON should help with memory a lot too.
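As a rough sketch of the suggestion above, the following prints a configure command that combines the two options. The source path `../llvm` (via the `LLVM_SRC` variable) is an assumption for illustration, and the script only prints the command rather than running it.

```shell
#!/bin/sh
# Sketch only: print a configure command that links with lld and builds
# LLVM as shared libraries. LLVM_SRC is a placeholder (assumption); point
# it at the llvm/ directory of your checkout.
LLVM_SRC="${LLVM_SRC:-../llvm}"

# LLVM_ENABLE_LLD=ON expects an lld that is already installed on the host.
# BUILD_SHARED_LIBS=ON builds each LLVM component as its own shared library.
configure_cmd="cmake -G Ninja -DLLVM_ENABLE_LLD=ON -DBUILD_SHARED_LIBS=ON $LLVM_SRC"

# Printed rather than executed, so it can be reviewed before use.
echo "$configure_cmd"
```

If lld is not available, replacing `-DLLVM_ENABLE_LLD=ON` with `-DLLVM_USE_LINKER=gold` is the fallback the comment describes.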

Some people try to compile with very little memory, and that's what ultimately kills these builds. I think we should mention that there is a theoretical limit to how small of a memory footprint we can create with these options and sometimes they will be forced to increase their swap space, if their physical memory is just too small.

llvm/docs/GettingStarted.rst
1101

We could add that if you need a Debug build, you could consider using -DLLVM_USE_SPLIT_DWARF=ON which should ease memory pressure on the linker significantly.

1104

We could also mention PARALLEL_COMPILE_JOBS but its impact is much less significant.

Meinersbur added a comment.EditedMar 1 2020, 12:23 PM

Additional mitigations we should cover:

  • Use ninja instead of make
  • BUILD_SHARED_LIBS=ON
  • Use the gold linker or lld.
  • Other components that can be disabled, but are ON by default.
  • Separate debug symbols file (-gsplit-dwarf)

Documenting this has been suggested in those threads several times. I think a separate section just covering this problem would be a good idea. Other parts in the manual (such as https://llvm.org/docs/CMake.html#options-and-variables) are references, and may not be where someone would look when trying to compile LLVM for the first time.

Perhaps this belongs in another patch but I think you bring up a good point that for someone trying to compile for the first time it might be intimidating to see so many options. Perhaps we could add something like LLVM_LIMIT_RESOURCES which will encompass many of these options?

Anything like this without an LLD/Gold reference is likely insufficient. In my experience, switching to Gold or LLD has been the biggest benefit so far.

That said, do we have a version of our build scripts that build LLD and use it from then on out? We might find that despite increasing the build time a bit, it would improve compilation on low-resource machines.

llvm/docs/GettingStarted.rst
1101

While slightly less useful (since you can't as easily debug), a note about release-with-asserts as a good middle ground that compiles roughly as quickly as the normal release build might be valuable.

Kokan added a comment.Mar 2 2020, 7:14 AM

Anything like this without an LLD/Gold reference is likely insufficient. In my experience, switching to Gold or LLD has been the biggest benefit so far.

I did some testing on Linux comparing -DLLVM_PARALLEL_LINK_JOBS and -DLLVM_USE_LINKER=lld (trying several different values for the number of parallel link jobs).
Based on that, I would also say that if you have lld, the effect of limiting parallel link jobs is insignificant[1].

On the other hand, I would limit the list of "tips" here (too many options could cause users not to read any of them; I've been there). I would stop at one to three options, and possibly there could be a separate section listing all of the other options.
The first in the list should be LLVM_USE_LINKER, IMHO (and I do not write that as an LLVM veteran, which I am not).

[1] https://reviews.llvm.org/D72402 Note: this is not an advertisement for that patch, which is kind of rotten because of me; it is just a reference to some data in its description.

  • Use ninja instead of make
  • BUILD_SHARED_LIBS=ON

Absolutely, though this should also mention the dynamic library build without BUILD_SHARED_LIBS -- some downstream uses of LLVM have a hard time with BUILD_SHARED_LIBS.

  • Separate debug symbols file (-gsplit-dwarf)

There's a CMake option for that -- LLVM_USE_SPLIT_DWARF.

Also worth mentioning is generating the gdb-index, which makes a ridiculous difference for debugging.

llvm/docs/GettingStarted.rst
1104

Personally, I think using PARALLEL_COMPILE_JOBS is an anti-pattern. You should just tell ninja (or make, but really you should be using ninja) to parallelize the build. Ninja uses all hardware threads by default, and you can easily teach this to make as well.

The reason for setting PARALLEL_LINK_JOBS is to reduce the parallelism during linking to avoid out-of-memory conditions.
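To illustrate the distinction drawn above, here is a minimal sketch that prints the commands involved. The source path and the link-job count of 2 are illustrative assumptions, not recommendations from the thread.

```shell
#!/bin/sh
# Sketch only: cap link parallelism at configure time and leave compile
# parallelism to the build tool. Both defaults below are assumptions.
LLVM_SRC="${LLVM_SRC:-../llvm}"   # placeholder path to the llvm/ directory
LINK_JOBS="${LINK_JOBS:-2}"       # illustrative cap on concurrent link jobs

# LLVM_PARALLEL_LINK_JOBS only takes effect with the Ninja generator.
configure_cmd="cmake -G Ninja -DLLVM_PARALLEL_LINK_JOBS=$LINK_JOBS $LLVM_SRC"
echo "$configure_cmd"

# Ninja chooses a parallel job count on its own; pass -j only to override it.
echo "ninja"

# With the Makefile generator, parallelism has to be requested explicitly.
echo "make -j$(getconf _NPROCESSORS_ONLN)"
```

The cap affects only link steps, so compile steps still run at whatever parallelism the build tool uses.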

  • Use ninja instead of make
  • BUILD_SHARED_LIBS=ON

Absolutely, though this should also mention the dynamic library build without BUILD_SHARED_LIBS -- some downstream uses of LLVM have a hard time with BUILD_SHARED_LIBS.

  • Separate debug symbols file (-gsplit-dwarf)

There's a CMake option for that -- LLVM_USE_SPLIT_DWARF.

Also worth mentioning is generating the gdb-index, which makes a ridiculous difference for debugging.

FWIW you actually /have/ to use gdb-index when using split DWARF with gdb, so far as I know/last I checked. (gdb will assume there's an index, query a trivial/empty one, and just stop/not go searching for things after the failed lookup). It'd be good if someone could fix the LLVM_USE_SPLIT_DWARF to add the linker flag (-Wl,--gdb-index). I /think/ -gsplit-dwarf when targeting gdb already implies -ggnu-pubnames which are needed to build the index in the linker (unless the linker is going to parse the DWARF manually, which is super slow/not great).
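Until the option adds the flag itself, one way to pass it by hand might look like the sketch below. It assumes a linker that accepts `--gdb-index` (gold and lld do), uses a placeholder source path, and only prints the command.

```shell
#!/bin/sh
# Sketch only: Debug build with split DWARF plus a linker-generated
# .gdb_index. LLVM_SRC is a placeholder path (assumption).
LLVM_SRC="${LLVM_SRC:-../llvm}"

# Passed through the compiler driver to the linker.
INDEX_FLAG="-Wl,--gdb-index"

# The flag is set for executables and shared libraries alike.
configure_cmd="cmake -G Ninja -DCMAKE_BUILD_TYPE=Debug -DLLVM_USE_LINKER=lld -DLLVM_USE_SPLIT_DWARF=ON -DCMAKE_EXE_LINKER_FLAGS=$INDEX_FLAG -DCMAKE_SHARED_LINKER_FLAGS=$INDEX_FLAG $LLVM_SRC"

echo "$configure_cmd"
```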

e-leclercq updated this revision to Diff 248620.Mar 5 2020, 4:01 PM

Added ninja as recommended build tool, lld linker option, and gold linker option. Also, clarified similarity to the -j option in make, as requested.

abrachet added inline comments.Mar 5 2020, 4:33 PM
llvm/docs/GettingStarted.rst
1099

Should we say the CMake flag for using Ninja is -G Ninja?

1101

These are CMake options, but this sentence makes it seem like they are Ninja options. We should also specify that LLVM_PARALLEL_LINK_JOBS is only meaningful when using Ninja.

Added the -G Ninja cmake flag. Mentioned that the -DLLVM_PARALLEL_LINK_JOBS option is only for building with ninja.

e-leclercq marked 6 inline comments as done.

Added release-with-asserts build type under -DCMAKE_BUILD_TYPE.

e-leclercq accepted this revision.Mar 14 2020, 2:21 PM
e-leclercq marked an inline comment as done.
This revision is now accepted and ready to land.Mar 14 2020, 2:21 PM
Meinersbur added inline comments.Mar 14 2020, 3:19 PM
llvm/docs/GettingStarted.rst
1105–1109

LLVM_ENABLE_LLD and LLVM_USE_LINKER are mutually exclusive. In particular, LLVM_ENABLE_LLD=ON is equivalent to LLVM_USE_LINKER=lld. I think "choice of linker" should be just one item.

Could you document what LLVM_USE_LINKER would be for using the gold linker?
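For reference, a small sketch of how a single "choice of linker" item could map to one flag. The helper name `linker_flag` is hypothetical; it relies on LLVM_USE_LINKER taking the linker name that is handed to the compiler's `-fuse-ld=` option.

```shell
#!/bin/sh
# Sketch only: map a linker name to the corresponding LLVM_USE_LINKER flag.
# linker_flag is a hypothetical helper, not part of LLVM's build system.
linker_flag() {
  case "$1" in
    lld|gold)
      # The value is the name given to the compiler's -fuse-ld= option.
      echo "-DLLVM_USE_LINKER=$1"
      ;;
    *)
      echo "linker not covered by this sketch: $1" >&2
      return 1
      ;;
  esac
}

linker_flag gold   # prints -DLLVM_USE_LINKER=gold
```

Since the two options are mutually exclusive, a command line built this way should not also set LLVM_ENABLE_LLD.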

1111–1116

Actually, LLVM defaults to CMAKE_BUILD_TYPE=Debug in its CMakeLists.txt:

if (NOT CMAKE_BUILD_TYPE AND NOT CMAKE_CONFIGURATION_TYPES)
  message(STATUS "No build type selected, default to Debug")
  set(CMAKE_BUILD_TYPE "Debug" CACHE STRING "Build type (default Debug)" FORCE)
endif()

This is not generally true for CMake projects, so as a rule of thumb, I personally always define CMAKE_BUILD_TYPE.

For the documentation, we should make readers aware of CMAKE_BUILD_TYPE=Release (or maybe even MinSizeRel) in contrast to the default Debug. Consider setting LLVM_ENABLE_ASSERTIONS=ON for release builds. It is off by default for non-debug builds.
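A minimal sketch of that configuration, with the build type stated explicitly instead of relying on the Debug default. The source path is a placeholder and the command is only printed.

```shell
#!/bin/sh
# Sketch only: Release build with assertions switched back on.
# LLVM_SRC is a placeholder path (assumption).
LLVM_SRC="${LLVM_SRC:-../llvm}"

# Assertions are off by default for non-debug builds, so enable them explicitly.
configure_cmd="cmake -G Ninja -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON $LLVM_SRC"

echo "$configure_cmd"
```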

1119–1121

Could you also mention that link jobs may require much more memory than compile jobs, so limiting the number of parallel link jobs may reduce the peak memory usage?

If memory is limited, I'd also recommend starting with LLVM_PARALLEL_LINK_JOBS=1 and optionally increasing it afterwards. Limiting the link jobs does not increase the total build time that much.

1132–1135

A recommendation would be useful for beginners. For instance, LLVM_ENABLE_PROJECTS=clang (instead of LLVM_ENABLE_PROJECTS=all) may suffice for many users.
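As an illustration, the sketch below prints a configure command limited to two projects. The pairing of clang and lld is only an example, and the source path is a placeholder.

```shell
#!/bin/sh
# Sketch only: enable a short list of projects instead of all of them.
# LLVM_SRC is a placeholder path (assumption).
LLVM_SRC="${LLVM_SRC:-../llvm}"

# Multiple projects are separated by semicolons, which must be quoted so
# the shell does not treat them as command separators.
configure_cmd="cmake -G Ninja -DLLVM_ENABLE_PROJECTS='clang;lld' $LLVM_SRC"

echo "$configure_cmd"
```

For a single project no quoting is needed: `-DLLVM_ENABLE_PROJECTS=clang`.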

1138

Just "clang static analyzer", no need to capitalize.

1141–1143

I have never used this option myself. It would be nice to mention what it does.

See also https://www.productive-cpp.com/improving-cpp-builds-with-split-dwarf/

You should not accept your own patches. Please wait for someone else to greenlight it.

e-leclercq requested review of this revision.Mar 17 2020, 4:12 PM
e-leclercq marked 6 inline comments as done.Mar 17 2020, 9:21 PM
e-leclercq updated this revision to Diff 250986.EditedMar 17 2020, 9:28 PM

Made the recommended changes: explained what the split dwarf option does, included more information on parallel link jobs, replaced the enable lld option with more information under the use linker option, added more information to the build type including the MinSizeRel setting, added some common projects as examples under the enable projects option, and changed the formatting of clang static analyzer, so that it is now correct.

Meinersbur added inline comments.Mar 17 2020, 9:47 PM
llvm/docs/GettingStarted.rst
1103

Maybe also mention that ninja is faster than make, especially when there is not a lot to (re-)build.

1148

Could you also mention the file extension (.dwo)?

e-leclercq marked an inline comment as done.

Added more information to the -G Ninja option and a note about the .dwo extension.

e-leclercq marked an inline comment as done.Mar 18 2020, 12:18 AM
Meinersbur added inline comments.Mar 18 2020, 9:30 AM
llvm/docs/GettingStarted.rst
1103

To clarify: ninja helps most with incremental builds. That is, only some of the files have changed and only some of the build artifacts need to be recreated, which is often the case in development cycles. The extreme case is that no file has changed, and make/ninja only need to check every file's modified date. make recursively calls itself in every directory, taking a multiple of the time ninja does for the same task.

The wording "if you are rebuilding or building multiple times." does not say that. Why would someone build the same source multiple times?

dim accepted this revision.Mar 18 2020, 1:51 PM

As far as I'm concerned, most suggestions make sense, and it is good to have this information clearly stated.

This revision is now accepted and ready to land.Mar 18 2020, 1:51 PM
This comment was removed by e-leclercq.

Fixed correctness in -G Ninja option

e-leclercq marked an inline comment as done.Mar 27 2020, 12:03 PM
Meinersbur accepted this revision.Mar 28 2020, 2:06 AM
lebedev.ri added inline comments.Mar 28 2020, 2:16 AM
llvm/docs/GettingStarted.rst
1115

I don't think there actually is a "release-with-asserts" build type, but "relwithdebinfo"?

1123

Why "of course"?

1124–1125

Since lld is internally parallel, I'd even say this should always be 1 if using lld.

Hi Evan,

I pushed this patch to the repository as 37943e518c5. Thanks and congratulations for your first contribution!

This revision was automatically updated to reflect the committed changes.
Meinersbur added inline comments.Mar 28 2020, 2:53 AM
llvm/docs/GettingStarted.rst
1115

What is meant here is -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON.

I don't think relwithdebinfo solves any problem compiling LLVM itself. Files will still be large because they contain debug info.

1124–1125

If this is true, our CMakeLists.txt should set LLVM_PARALLEL_LINK_JOBS whenever LLVM_USE_LINKER is set to lld.

thakis added a subscriber: thakis.Mar 28 2020, 4:26 AM

Thanks for doing this!

I spend a lot of time with llvm's build, so here are some more comments.

llvm/docs/GettingStarted.rst
1110

This is misleading. You should always use lld, even when working on lld. Lld is faster at linking lld than gold is. This should recommend lld over gold more strongly, and possibly not mention gold at all -- I'm aware of no reason to use gold (except possibly if you're on a mips box).

1111–1116

Looks like this got lost. I agree that a release build with LLVM_ENABLE_ASSERTIONS=ON is a good build config, so it makes sense to mention both.

1120

Do you have numbers to back this up? At least for release builds, I haven't seen a speedup from this (when using lld to link), on systems with as little as 16 GB of memory.

1124–1125

lld doesn't use that much parallelism, and lld jobs don't run that long. I don't think this should default to 1 with lld (if you have benchmarks that show it's useful, that changes things of course, but that'd surprise me).

1134

I believe this is only useful if you're doing debug builds, which you already advise against.

1144

"significantly" oversells this; it reduced full build time of clang by ~5% last I checked.

1147

Maybe mention that this one is Linux only.

@e-leclercq Would you create a new diff to address the new comments?

llvm/docs/GettingStarted.rst
1110

One reason to not use lld is when building LLVM the first time on a system that does not have lld installed (or available as binary in a package repository).

1134

You need a debug build for debugging, when printf-debugging is not enough.

e-leclercq marked 10 inline comments as done.Mar 31 2020, 12:41 PM

@Meinersbur, should I update the diff here or create a new revision?

Create a new revision.