I added a list of options to configure in case someone has issues with long build times or running out of memory. This was added under common problems in the getting started section of the documentation.
Repository: rG LLVM Github Monorepo
I think this should also cover the make vs. ninja generators, and using at least ld.gold or, better, lld.
llvm/docs/GettingStarted.rst | ||
---|---|---|
1104 | Should mention that this is ninja specific. |
Issues with compile time and memory requirements come up regularly, so we thought it would be a good idea to document mitigations. For instance:
https://lists.llvm.org/pipermail/llvm-dev/2020-January/137883.html
https://lists.llvm.org/pipermail/llvm-dev/2019-June/133506.html
http://lists.llvm.org/pipermail/llvm-dev/2019-November/137226.html
https://reviews.llvm.org/D72402
Documenting this has been suggested in those threads several times. I think a separate section covering just this problem would be a good idea. Other parts of the manual (such as https://llvm.org/docs/CMake.html#options-and-variables) are references, and may not be where someone would look when trying to compile LLVM for the first time.
Thanks for this! I was thinking about adding this as well because we get failed build emails to the list all the time.
We could recommend using -DLLVM_ENABLE_LLD=ON or gold if they don't have lld. Also -DBUILD_SHARED_LIBS=ON should help with memory a lot too.
Some people try to compile with very little memory, and that's what ultimately kills these builds. I think we should mention that there is a theoretical limit to how far these options can shrink the memory footprint, and that if physical memory is just too small, users will sometimes be forced to increase their swap space.
llvm/docs/GettingStarted.rst | ||
---|---|---|
1101 | We could add that if you need a Debug build, you could consider using -DLLVM_USE_SPLIT_DWARF=ON which should ease memory pressure on the linker significantly. | |
1104 | We could also mention PARALLEL_COMPILE_JOBS but its impact is much less significant. |
Additional mitigations we should cover:
- Use ninja instead of make
- BUILD_SHARED_LIBS=ON
- Use the gold linker or lld.
- Other components that can be disabled, but are ON by default.
- Separate debug symbols file (-gsplit-dwarf)
Perhaps this belongs in another patch but I think you bring up a good point that for someone trying to compile for the first time it might be intimidating to see so many options. Perhaps we could add something like LLVM_LIMIT_RESOURCES which will encompass many of these options?
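For illustration, several of the mitigations listed above might be combined into a single configure command along these lines. This is only a sketch: it assumes an empty build directory that sits next to the llvm source directory, that ninja and lld are already installed, and that a shared-library build is acceptable for your use. If lld is not available, -DLLVM_USE_LINKER=gold could be substituted.

```shell
# Sketch only: run from an empty build directory inside the monorepo checkout,
# so that ../llvm is the LLVM source directory (an assumption about layout).
# No CMAKE_BUILD_TYPE is given, so LLVM falls back to its default Debug build,
# which is the case where split DWARF is relevant.
cmake -G Ninja \
  -DLLVM_USE_LINKER=lld \
  -DBUILD_SHARED_LIBS=ON \
  -DLLVM_USE_SPLIT_DWARF=ON \
  ../llvm

# Build with ninja, which parallelizes by default.
ninja
```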
Anything like this without an LLD/Gold reference is likely insufficient. In my experience, switching to Gold or LLD has been the biggest benefit so far.
That said, do we have a version of our build scripts that build LLD and use it from then on out? We might find that despite increasing the build time a bit, it would improve compilation on low-resource machines.
llvm/docs/GettingStarted.rst | ||
---|---|---|
1101 | While slightly less useful (since you can't as easily debug), a note about release-with-asserts as a good middle ground that compiles roughly as quickly as the normal release build might be valuable. |
> Anything like this without an LLD/Gold reference is likely insufficient. In my experience, switching to Gold or LLD has been the biggest benefit so far.
I did some testing comparing -DLLVM_PARALLEL_LINK_JOBS and -DLLVM_USE_LINKER=lld on Linux (trying several different values for the number of parallel link jobs).
Based on that, I would also say that if you have lld, the parallel link setting is insignificant [1].
On the other hand, I would limit the list of "tips" here, since too many options could cause users not to read it (I've been there). I would mostly stop at 1-3 options, and possibly there could be a different section listing all of the other options.
The first in the list should be LLVM_USE_LINKER, IMHO (and I say that as someone who is not an LLVM veteran).
[1] https://reviews.llvm.org/D72402 Note: this is not an advertisement for that patch, which has kind of rotted because of me; it is just a reference to some data in its description.
Absolutely, though this should also mention the dynamic library build without BUILD_SHARED_LIBS -- some downstream uses of LLVM have a hard time with BUILD_SHARED_LIBS.
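As a hedged illustration, and assuming the comment above refers to LLVM's single shared library (libLLVM) options, such a configuration might look like the following. The build directory layout is an assumption.

```shell
# Sketch only: build one libLLVM shared library and link the tools against it,
# without turning on BUILD_SHARED_LIBS. Assumes ../llvm is the source directory.
cmake -G Ninja \
  -DLLVM_BUILD_LLVM_DYLIB=ON \
  -DLLVM_LINK_LLVM_DYLIB=ON \
  ../llvm
```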
> - Separate debug symbols file (-gsplit-dwarf)
There's a CMake option for that -- LLVM_USE_SPLIT_DWARF.
Also worth mentioning is generating the gdb-index, which makes a ridiculous difference for debugging.
llvm/docs/GettingStarted.rst | ||
---|---|---|
1104 | Personally, I think using PARALLEL_COMPILE_JOBS is an anti-pattern. You should just tell ninja (or make, but really you should be using ninja) to parallelize the build. Ninja uses all hardware threads by default, and you can easily teach this to make as well. The reason for setting PARALLEL_LINK_JOBS is to reduce the parallelism during linking to avoid out-of-memory conditions. |
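To make the reasoning about link jobs concrete, here is a minimal sketch of how one might pick a value for LLVM_PARALLEL_LINK_JOBS. The helper name link_jobs and the per-link memory budget are hypothetical: the budget is a number you choose for your own machine and build type, not a figure measured for LLVM.

```shell
# Hypothetical helper: derive a link-job count from total memory and a
# per-link memory budget that you supply (both in GiB, whole numbers).
link_jobs() {
  mem_gib=$1
  per_link_gib=$2
  jobs=$((mem_gib / per_link_gib))
  # Always allow at least one link job.
  if [ "$jobs" -lt 1 ]; then
    jobs=1
  fi
  echo "$jobs"
}

# Example: 16 GiB of RAM and an assumed budget of 8 GiB per link job.
echo "-DLLVM_PARALLEL_LINK_JOBS=$(link_jobs 16 8)"
```

The printed flag can be added to the cmake command line. Compile parallelism is left to ninja itself, in line with the comment above.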
FWIW you actually /have/ to use gdb-index when using split DWARF with gdb, so far as I know/last I checked. (gdb will assume there's an index, query a trivial/empty one, and just stop/not go searching for things after the failed lookup). It'd be good if someone could fix the LLVM_USE_SPLIT_DWARF to add the linker flag (-Wl,--gdb-index). I /think/ -gsplit-dwarf when targeting gdb already implies -ggnu-pubnames which are needed to build the index in the linker (unless the linker is going to parse the DWARF manually, which is super slow/not great).
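Until the build system adds the linker flag itself, passing it by hand might look like the sketch below. It assumes a linker that supports --gdb-index (gold and lld do) and a build directory next to the llvm source directory.

```shell
# Sketch only: Debug build with split DWARF, asking the linker to emit a
# .gdb_index section for executables and shared libraries.
cmake -G Ninja \
  -DCMAKE_BUILD_TYPE=Debug \
  -DLLVM_USE_LINKER=lld \
  -DLLVM_USE_SPLIT_DWARF=ON \
  -DCMAKE_EXE_LINKER_FLAGS="-Wl,--gdb-index" \
  -DCMAKE_SHARED_LINKER_FLAGS="-Wl,--gdb-index" \
  ../llvm
```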
Added ninja as recommended build tool, lld linker option, and gold linker option. Also, clarified similarity to the -j option in make, as requested.
Added the -G Ninja cmake flag. Mentioned that the -DLLVM_PARALLEL_LINK_JOBS option is only for building with ninja.
llvm/docs/GettingStarted.rst | ||
---|---|---|
1105–1109 | LLVM_ENABLE_LLD and LLVM_USE_LINKER are mutually exclusive. In particular, LLVM_ENABLE_LLD=ON is equivalent to LLVM_USE_LINKER=lld. I think "choice of linker" should be just one item. Could you document what LLVM_USE_LINKER would be for using the gold linker? | |
1111–1116 | Actually, LLVM defaults to CMAKE_BUILD_TYPE=Debug in its CMakeLists.txt: if (NOT CMAKE_BUILD_TYPE AND NOT CMAKE_CONFIGURATION_TYPES) message(STATUS "No build type selected, default to Debug") set(CMAKE_BUILD_TYPE "Debug" CACHE STRING "Build type (default Debug)" FORCE) endif() This is not generally true for CMake projects, so as a rule of thumb, I personally always define CMAKE_BUILD_TYPE. The documentation should make readers aware of CMAKE_BUILD_TYPE=Release (or maybe even MinSizeRel) in contrast to the default Debug. Consider switching LLVM_ENABLE_ASSERTIONS=ON for release builds. It is off by default for non-debug builds. | |
1119–1121 | Could you also mention that link jobs may require much more memory than compile jobs, so limiting the number of parallel link jobs may reduce the peak memory usage? If memory is limited, I'd also recommend starting with LLVM_PARALLEL_LINK_JOBS=1 and optionally increasing it afterwards. Limiting the link jobs does not increase the total build time that much. | |
1132–1135 | A recommendation would be useful for beginners. For instance, LLVM_ENABLE_PROJECTS=clang (instead of LLVM_ENABLE_PROJECTS=all) may suffice for many users. | |
1138 | Just "clang static analyzer", no need to capitalize. | |
1141–1143 | I have never used this option myself. It would be nice to mention what it does. See also https://www.productive-cpp.com/improving-cpp-builds-with-split-dwarf/ |
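Put together, the suggestions in these inline comments might translate into a beginner-oriented configure line such as the sketch below. It assumes a build directory next to the llvm source directory and that only clang is needed.

```shell
# Sketch only: release build with assertions, clang as the only extra project,
# and a single link job as a conservative starting point on low-memory machines.
cmake -G Ninja \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_ENABLE_ASSERTIONS=ON \
  -DLLVM_ENABLE_PROJECTS=clang \
  -DLLVM_PARALLEL_LINK_JOBS=1 \
  ../llvm
```

If the build completes comfortably, LLVM_PARALLEL_LINK_JOBS can be raised afterwards.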
You should not accept your own patches. Please wait for someone else to greenlight it.
Made the recommended changes: explained what the split dwarf option does, included more information on parallel link jobs, replaced the enable lld option with more information under the use linker option, added more information to the build type including the MinSizeRel setting, added some common projects as examples under the enable projects option, and changed the formatting of clang static analyzer, so that it is now correct.
llvm/docs/GettingStarted.rst | ||
---|---|---|
1103 | To clarify: ninja helps most with incremental builds, that is, when only some of the files have changed and only some of the build artifacts need to be recreated, which is often the case in development cycles. The extreme case is that no file has changed, and make/ninja only need to check every file's modification date. make recursively calls itself in every directory, taking several times as long as ninja for the same task. The wording "if you are rebuilding or building multiple times." does not say that. Why would someone build the same source multiple times? |
As far as I'm concerned, most suggestions make sense, and it is good to have this information clearly stated.
Hi Evan,
I pushed this patch to the repository as 37943e518c5. Thanks, and congratulations on your first contribution!
llvm/docs/GettingStarted.rst | ||
---|---|---|
1115 | What is meant here is -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON. I don't think relwithdebinfo solves any problem compiling LLVM itself. Files will still be large because they contain debug info. | |
1124–1125 | If this is true, our CMakeLists.txt should set LLVM_PARALLEL_LINK_JOBS whenever LLVM_USE_LINKER is set to lld. |
Thanks for doing this!
I spend a lot of time with llvm's build, so here are some more comments.
llvm/docs/GettingStarted.rst | ||
---|---|---|
1110 | This is misleading. You should always use lld, even when working on lld. Lld is faster at linking lld than gold is. This should recommend lld over gold more strongly, and possibly not mention gold at all -- I'm aware of no reason to use gold (except possibly if you're on a mips box). | |
1111–1116 | Looks like this got lost. I agree that a release build with LLVM_ENABLE_ASSERTIONS=ON is a good build config, so it makes sense to mention both. | |
1120 | Do you have numbers to back this up? At least for release builds, I haven't seen a speedup from this (when using lld to link), on systems with as little as 16 GB of memory. | |
1124–1125 | lld doesn't use that much parallelism, and lld jobs don't run that long. I don't think this should default to 1 with lld (if you have benchmarks that show it's useful, that changes things of course, but that'd surprise me). | |
1134 | I believe this is only useful if you're doing debug builds, which you already advise against. | |
1144 | "significantly" oversells this; it reduced full build time of clang by ~5% last I checked. | |
1147 | Maybe mention that this one is Linux only. |
@e-leclercq Would you create a new diff to address the new comments?
llvm/docs/GettingStarted.rst | ||
---|---|---|
1110 | One reason to not use lld is when building LLVM the first time on a system that does not have lld installed (or available as binary in a package repository). | |
1134 | You need a debug build for debugging, when printf-debugging is not enough. |
Should we say the CMake flag for using Ninja is -G Ninja?