This is an archive of the discontinued LLVM Phabricator instance.

[CMake] Default options for faster executables on MSVC
ClosedPublic

Authored by aganea on Nov 29 2018, 8:25 AM.

Details

Summary

This patch:

  • Disables incremental linking by default. /INCREMENTAL adds extra thunks in the EXE, which makes execution slower.
  • Sets /MT (static CRT lib) by default instead of CMake's default /MD (dll CRT lib). The previous default /MD makes all DLL functions to be thunked, thus making execution slower (memcmp, memset, etc.)
  • Adds LLVM_ENABLE_INCREMENTAL_LINK which is set to OFF by default.

DLL thunks:

/INCREMENTAL thunks:

I thought it'd be better if Clang / LLVM / LLD would be fast out-of-the-box, instead of people having to dig for "good" settings.

Below are timings for linking a large DLL with LLD. The type merging pass is dominant, and is affected by those two options, because of the tight merge loop. I currently have several optimisations in that loop that tend to increase the gap.

Without this patch (default cmake settings):

Type Merging:           23174 ms ( 56.9%)

With this patch (new default cmake settings):

Type Merging:           22338 ms ( 52.2%)

This is a MSVC-only change. Tested with VS2017 15.9.1 (MSVC 19.16.27023.1 x64)

Diff Detail

Repository
rL LLVM

Event Timeline

aganea created this revision.Nov 29 2018, 8:25 AM
aganea edited the summary of this revision. (Show Details)Nov 29 2018, 8:26 AM

So, one thing to keep in mind is that RelWithDebInfo is not what it looks like. If you are like me, you might think that it enables the same optimizations as a Release build and enables debug info. Unfortunately, that's incorrect. You've noticed that it uses the /INCREMENTAL link flag, but more importantly, it adds the /Ob1 compile flag, which forbids the compiler from inlining functions not specifically marked inline. This means there are *significant* performance differences between a plain Release build and a RelWithDebInfo build, so it's not useful for profiling. To deal with this, @zturner added the LLVM_ENABLE_PDB option.

I suppose if we want we can go down this path of trying to take things into our own hands and overriding the cmake defaults, but it seemed better to sidestep the issue and keep building in the Release configuration, which already passes /INCREMENTAL:NO. But, maybe if you're building from the IDE, I could see why you'd prefer to make RelWithDebInfo behave the way we expect it to.

llvm/trunk/cmake/modules/ChooseMSVCCRT.cmake
70–71 ↗(On Diff #175858)

Isn't it a bad practice to statically link the CRT into DLLs? We have a few (LLDB), and this will get picked up by it. However, I notice that we set this option in the official build_llvm_package.bat script that @hans uses to make the Windows releases:
https://github.com/llvm-git-prototype/llvm/blob/master/llvm/utils/release/build_llvm_package.bat#L47

Anyway, this is just the default, and it's reasonable for people building clang, lld, and other static executables, which is most people.

llvm/trunk/cmake/modules/HandleLLVMOptions.cmake
387 ↗(On Diff #175858)

Looks like it needs one more indentation level to be in the loop over target types. We don't have a formatter tool for cmake, but it would be easier to read if this was broken up onto multiple lines.

zturner added inline comments.Nov 30 2018, 11:19 AM
llvm/trunk/cmake/modules/HandleLLVMOptions.cmake
387 ↗(On Diff #175858)

Instead of replacing all the options in an existing command line, why don't we just have something like:

if (CMAKE_CXX_COMPILER_ID MATCHES "MSVC" AND NOT LLVM_ENABLE_INCREMENTAL_LINK AND CMAKE_BUILD_TYPE != "Debug")
  append("/INCREMENTAL:NO" CMAKE_LINKER_FLAGS)
endif()
aganea updated this revision to Diff 176179.Nov 30 2018, 12:15 PM
aganea marked 3 inline comments as done.
aganea added a reviewer: hans.

Indent.

llvm/trunk/cmake/modules/ChooseMSVCCRT.cmake
70–71 ↗(On Diff #175858)

Isn't it a bad practice to statically link the CRT into DLLs?

Yes, only if you dynamically load (third-party) DLLs in your executable at runtime (plugins say) The EXE and DLLs might be compiled with a different CRTs and you might end up accessing wrong class/structure layouts in memory.

However only LTO seems to be compiled as a DLL. I'm not sure why, and how it's used? Is it supposed to be dynamically loaded by LLD?

llvm/trunk/cmake/modules/HandleLLVMOptions.cmake
387 ↗(On Diff #175858)

(note: CMAKE_LINKER_FLAGS doesn't exist, rather CMAKE_EXE_LINKER_FLAGS CMAKE_MODULE_LINKER_FLAGS CMAKE_SHARED_LINKER_FLAGS CMAKE_STATIC_LINKER_FLAGS)

I've tried several approches including the one you suggest, however replacing the flag (REGEX REPLACE) is the only one that works with my CMake 3.12.2. Probably because how /INCREMENTAL is forcibly set by CMake, and probably because how flags are then rendered into the .vcxproj by the generator.

In D55056#1314996, @rnk wrote:

So, one thing to keep in mind is that RelWithDebInfo is not what it looks like. If you are like me, you might think that it enables the same optimizations as a Release build and enables debug info. Unfortunately, that's incorrect. You've noticed that it uses the /INCREMENTAL link flag, but more importantly, it adds the /Ob1 compile flag, which forbids the compiler from inlining functions not specifically marked inline. This means there are *significant* performance differences between a plain Release build and a RelWithDebInfo build, so it's not useful for profiling. To deal with this, @zturner added the LLVM_ENABLE_PDB option.

I suppose if we want we can go down this path of trying to take things into our own hands and overriding the cmake defaults, but it seemed better to sidestep the issue and keep building in the Release configuration, which already passes /INCREMENTAL:NO. But, maybe if you're building from the IDE, I could see why you'd prefer to make RelWithDebInfo behave the way we expect it to.

Currently RelWithDebInfo is "the cake is a lie". It isn't exactly what you expect, and users might loose long hours (like I did) understanding why the build is slower, when compared to Release. You want RelWithDebInfo because you want to debug crashes in production (I wasn't aware of LLVM_ENABLE_PDB, thanks!).
This patch makes RelWithDebInfo almost equivalent to Release+PDB, except for /Ob1 (which we could also change to /Ob2 in RelWithDebInfo). In the past /Ob2 was harder to debug with MSVC; but since the addition of /d2Zi+ slash /Zo, optimized builds suddenly have friendlier debug streams.

rnk accepted this revision.Nov 30 2018, 1:50 PM

I just chatted with @zturner, and we came to the conclusion that if you want to make RelWithDebInfo have the exact same optimizations as Release, we'd be in favor. Right now it's just a trap. I think the optimization level may be different on non-Windows platforms, maybe -O3 vs -O2 or -Os, but I've forgotten.

The MinSizeRel changes also seem reasonable.

llvm/trunk/cmake/modules/ChooseMSVCCRT.cmake
70–71 ↗(On Diff #175858)

I believe it is. I think Sony uses it for their linker, which runs on Windows. I'm not aware of other users. LLD links LLVM statically (bloating the release package, but that's another story).

LLDB is also a Python module, which is basically a DLL. So, these are kind of bad defaults for those projects, but I think they are in the minority and whoever is building them can override this default. /MT is a good default for our released packages anyway, since it makes them more redistributable.

This revision is now accepted and ready to land.Nov 30 2018, 1:50 PM
aganea added a comment.Dec 1 2018, 6:28 AM
In D55056#1315222, @rnk wrote:

I just chatted with @zturner, and we came to the conclusion that if you want to make RelWithDebInfo have the exact same optimizations as Release, we'd be in favor. Right now it's just a trap. I think the optimization level may be different on non-Windows platforms, maybe -O3 vs -O2 or -Os, but I've forgotten.

The MinSizeRel changes also seem reasonable.

In that case can’t we just keep Debug and Release+PDB, and drop the two others?

rnk added a comment.Dec 3 2018, 11:17 AM

In that case can’t we just keep Debug and Release+PDB, and drop the two others?

I guess I'm not sure what you mean. Do you mean, drop them, as in find a way to remove them from the list of default configurations so that they don't display in IDEs as a configuration option? That makes sense. But I imagine that IDE users would want some configuration for "Release+PDB" that doesn't require re-running cmake, and RelWithDebInfo seems like the right name for it.

In D55056#1317117, @rnk wrote:

In that case can’t we just keep Debug and Release+PDB, and drop the two others?

I guess I'm not sure what you mean. Do you mean, drop them, as in find a way to remove them from the list of default configurations so that they don't display in IDEs as a configuration option? That makes sense. But I imagine that IDE users would want some configuration for "Release+PDB" that doesn't require re-running cmake, and RelWithDebInfo seems like the right name for it.

The other way around: why would you keep Release without debug info?

aganea added a comment.EditedDec 3 2018, 11:24 AM

The point being, I'd keep Debug & Release targets; drop RelWithDebInfo & MinSizeRel; and add debugging info (LLVM_ENABLE_PDB) to the Release target.

Some people like to build Release without debug info because debug info slows down the build time. Personally I think debug info should be always-on and we should use the increased build time as a way to motivate people to submit patches to make debug info generation faster, but that's just me :)

rnk added a comment.Dec 3 2018, 1:15 PM

The point being, I'd keep Debug & Release targets; drop RelWithDebInfo & MinSizeRel; and add debugging info (LLVM_ENABLE_PDB) to the Release target.

Makes sense, but LLVM_ENABLE_PDB doesn't make sense on non-Windows OSs. We also have to consider that linking with debug info enabled on Linux is painfully slow, so it might make more sense to make it off by default. I learned to build in Release+Asserts on Linux specifically to speed up the build. As slow as it is to make PDBs with LLD, I promise you dsymutil and gold+objcopy are slower. =S

So, after considering all that, maybe it is best to have MSVC and clang-cl build diverge from the other platforms. If we think making PDBs by default is fast enough, then let's just enable them for all configurations and get rid of MinSizeRel and RelWithDebInfo.

aganea added a comment.Dec 3 2018, 1:23 PM
In D55056#1317377, @rnk wrote:

Makes sense, but LLVM_ENABLE_PDB doesn't make sense on non-Windows OSs. We also have to consider that linking with debug info enabled on Linux is painfully slow, so it might make more sense to make it off by default. I learned to build in Release+Asserts on Linux specifically to speed up the build. As slow as it is to make PDBs with LLD, I promise you dsymutil and gold+objcopy are slower. =S

How do you debug Release builds if you need to? (on Linux let's say) Do you activate debug info only for the TU that needs debugging?

So, after considering all that, maybe it is best to have MSVC and clang-cl build diverge from the other platforms. If we think making PDBs by default is fast enough, then let's just enable them for all configurations and get rid of MinSizeRel and RelWithDebInfo.

link.exe is still slow, especially if /INCREMENTAL is disabled. Before going any further on discussing this, I'll do a round of compilation tests to compare with/without MinSizeRel/RelWithDebInfo.

This revision was automatically updated to reflect the committed changes.

This change caused a number of failures in the LLDB tests (as in 40+ failures) on Windows:

http://lab.llvm.org:8014/builders/lldb-x64-windows-ninja/builds/2404/steps/test/logs/stdio

It looks like LLDB is occasionally failing to execute with exception 0xC0000409 (STATUS_STACK_BUFFER_OVERRUN). I think we should revert the change until we can correctly update the CMake files for LLDB unless it can be fixed quickly.

Reverted in r349654. I will take a look at the LLDB tests.

My bad - I reverted the wrong thing. Reverted for good in r349656.

dmajor added a subscriber: dmajor.Feb 25 2019, 7:58 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 25 2019, 7:58 AM
aganea updated this revision to Diff 200578.May 22 2019, 5:33 PM

Reopening with fixes for LLDB.

This build break was caused by the liblldb.dll used along with /MT (static CRT) which is usually not advisable. lldb.exe was initializing a handle for the stdout stream (using _pipe) and consuming it in liblldb.dll. /MT evidently creates internal states that are different across module bounds (exe, dll), which causes the stdout handle (__pioinfo in this case) to point to separate copies in memory.

aganea reopened this revision.May 22 2019, 5:33 PM
This revision is now accepted and ready to land.May 22 2019, 5:33 PM
cmake/modules/ChooseMSVCCRT.cmake
72

As far as I understand, /MT is being disabled because it is required by LLDB that it be disabled. Is that correct? If that's the case, I think the message needs to change to be clearer. For example: "Disabling /MT *as* required by LLDB". Right now it reads like /MT is being disabled but it is required by LLDB.

aganea updated this revision to Diff 201013.May 23 2019, 10:49 AM
aganea marked 2 inline comments as done.
aganea added a reviewer: thakis.
aganea added inline comments.
cmake/modules/ChooseMSVCCRT.cmake
72

As requested: "Disabling /MT as required by LLDB"

rnk added a comment.May 23 2019, 11:50 AM

Previously I passed LLVM_USE_CRT_RELEASE=MT explicitly as part of my build, and I ran into the same problems with LLDB that you did. I'll try to modify this once it lands to warn when the user tries to do that. At the time, I couldn't see how to do it easily.

In D55056#1514451, @rnk wrote:

Previously I passed LLVM_USE_CRT_RELEASE=MT explicitly as part of my build, and I ran into the same problems with LLDB that you did. I'll try to modify this once it lands to warn when the user tries to do that. At the time, I couldn't see how to do it easily.

I can change this to (also) display the message when LLVM_USE_CRT_* is passed explicitly.

aganea updated this revision to Diff 201044.EditedMay 23 2019, 12:56 PM

If using -DLLVM_TOOL_LLDB_BUILD it now displays "Using /MD as required by LLDB."
If using -DLLVM_TOOL_LLDB_BUILD -DLLVM_USE_CRT_RELEASE=MT it displays "Disabling /MT as required by LLDB."
Otherwise /MT is the new default.

aganea requested review of this revision.May 23 2019, 12:56 PM
rnk accepted this revision.May 23 2019, 1:26 PM

lgtm

This revision is now accepted and ready to land.May 23 2019, 1:26 PM
This revision was automatically updated to reflect the committed changes.
dmajor added a comment.Oct 1 2019, 6:22 AM

For posterity: this was reverted in r361837.