This CL adds the /Zc:DllexportInlines flag to clang-cl.
When /Zc:DllexportInlines- is specified, an inline member function of a dllexported class is not exported unless the function has local static variables.
Because those inline functions are not exported, code for them is not generated unless they are actually used, which reduces both compile time and obj size. With this flag, inline member functions of a dllimported class are likewise not imported unless they have local static variables.
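As a rough illustration of the rule described above, the sketch below marks which members of an exported class would be affected. The class `Counter` and the macro `EXPORT_API` are hypothetical names made up for this example, and the comments restate the behavior described in this CL rather than verified compiler output.

```cpp
// Hypothetical export macro: expands to dllexport on Windows, nothing elsewhere,
// so the sketch compiles on any platform.
#if defined(_WIN32)
#define EXPORT_API __declspec(dllexport)
#else
#define EXPORT_API
#endif

class EXPORT_API Counter {
public:
  // Inline member without local statics: per the description above,
  // /Zc:DllexportInlines- means this is not exported from the DLL.
  int twice(int x) const { return 2 * x; }

  // Inline member with a local static variable: still exported, so that
  // every module using the DLL shares the single `calls` counter.
  int next_id() {
    static int calls = 0;
    return ++calls;
  }

  // Declared here and defined out of line: not an inline function,
  // so the flag does not change how it is exported.
  int offset(int x) const;
};

int Counter::offset(int x) const { return x + 1; }
```

The local-static case is kept exported because a per-module copy of `next_id()` would otherwise give each module its own `calls` variable.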
On my 24C48T Windows 10 machine, the build performance of the chrome target in the chromium repository is shown below.
These stats were taken with the build config 'target_cpu="x86" enable_nacl=false is_component_build=true dcheck_always_on=true' and with the following changes applied:
- https://chromium-review.googlesource.com/c/chromium/src/+/1212379
- https://chromium-review.googlesource.com/c/v8/v8/+/1186017
The stats below were taken with this patch applied on top of https://github.com/llvm-project/llvm-project-20170507/commit/a05115cd4c57ff76b0f529e38118765b58ed7f2e
config | build time | speedup | build dir size |
with patch, PCH on, debug | 1h10m0s | x1.13 | 35.6GB |
without patch, PCH on, debug | 1h19m17s | | 49.0GB |
with patch, PCH off, debug | 1h15m45s | x1.16 | 33.7GB |
without patch, PCH off, debug | 1h28m10s | | 52.3GB |
with patch, PCH on, release | 1h13m13s | x1.22 | 26.2GB |
without patch, PCH on, release | 1h29m57s | | 37.5GB |
with patch, PCH off, release | 1h23m38s | x1.32 | 23.7GB |
without patch, PCH off, release | 1h50m50s | | 38.7GB |
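For reference, the speedup column can be reproduced from the build times. This is a minimal sketch, assuming speedup is the build time without the patch divided by the build time with it. The helper names `to_seconds` and `speedup` are made up for this example.

```cpp
// Converts an hours/minutes/seconds triple such as 1h19m17s into seconds.
constexpr int to_seconds(int hours, int minutes, int seconds) {
  return hours * 3600 + minutes * 60 + seconds;
}

// Assumed definition: time without the patch divided by time with the patch.
constexpr double speedup(int without_patch_s, int with_patch_s) {
  return static_cast<double>(without_patch_s) / with_patch_s;
}

// PCH on, debug: 1h19m17s without the patch vs. 1h10m0s with it.
constexpr double kPchOnDebug =
    speedup(to_seconds(1, 19, 17), to_seconds(1, 10, 0));
```

Under that assumption the PCH on, debug row comes out at roughly 1.13, matching the table. The reported factors appear to be truncated to two decimals rather than rounded.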
This patch greatly reduced obj size and the number of exported symbols, which improved link time too.
E.g. the link time stats of blink_core.dll are as follows:
config | cold disk cache | warm disk cache |
with patch, PCH on, debug | 71s | 30s |
without patch, PCH on, debug | 111s | 48s |
This patch's implementation is based on Nico Weber's patch. I modified it to support static local variables, added tests, and took stats.
Just one more small thing I remembered: please add a test in tests/Driver/cl-options.c to make sure we translate this to the cc1 flag correctly.