This is an archive of the discontinued LLVM Phabricator instance.

/redundancyReport proof of concept
Needs Review · Public

Authored by thakis on Aug 8 2017, 12:28 PM.

Details

Reviewers
ruiu
Summary

I don't think we want to check this in, but maybe lld should have a feature that's kind of like this one [1].

During a C++ compilation, inline functions can be codegen'd into many .o files, only to be merged later by the linker. If a class is dllexported, _all_ of its inline member functions get codegen'd into _every_ TU that includes the header declaring the class. This generates a lot of redundant work that can often be avoided fairly easily, but there's currently no good way to ask the linker which inline functions have the "most" redundancy. This patch adds a "/redundancyReport:file" flag to lld-link. If the flag is present, lld-link writes one line per function with three columns: the total number of redundantly codegen'd bytes due to the function, the number of times the function was present in linked-in obj files, and the name of the function. The lines are sorted by total redundancy.
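To make the report format concrete, here is a minimal Python sketch of the aggregation it describes. It is not the patch's implementation (the patch itself lives in lld-link). The names `redundancy_report` and `format_report` are hypothetical, and so are two assumptions the summary does not spell out: that "redundant bytes" means the size of all copies minus the first copy seen, and that columns are tab-separated.

```python
from collections import defaultdict


def redundancy_report(objects):
    """Aggregate per-function redundancy across object files.

    `objects` is an iterable of object files, each given as an iterable of
    (function_name, size_in_bytes) pairs. Returns rows of
    (redundant_bytes, times_present, function_name), sorted by
    redundant_bytes in descending order (ties broken by name).
    """
    total = defaultdict(int)  # bytes across all copies of a function
    count = defaultdict(int)  # number of object files containing it
    kept = {}                 # assumption: the first copy seen is the one kept
    for functions in objects:
        for name, size in functions:
            total[name] += size
            count[name] += 1
            kept.setdefault(name, size)
    # Assumption: redundancy = all copies minus the single kept copy.
    rows = [(total[n] - kept[n], count[n], n) for n in count]
    rows.sort(key=lambda row: (-row[0], row[2]))
    return rows


def format_report(rows):
    """Render rows as three tab-separated columns (separator is assumed)."""
    return "\n".join(f"{r}\t{c}\t{n}" for r, c, n in rows)
```

For example, a function of 10 bytes present in three object files would show up with 20 redundant bytes and a count of 3 under these assumptions.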

With some Python scripts doing pretty heavy postprocessing of lld-link's output, I was able to reduce link.exe's memory consumption for linking blink_core.dll by 600MB (6%), and the size of blink_core.ilk (which link.exe uses for incremental linking) by 20%; see https://bugs.chromium.org/p/chromium/issues/detail?id=560475#c60 and the following 10 or so comments. Since so much postprocessing was needed, we probably don't want this feature as is, but maybe something like it. I'm just putting this out there in case someone else has a good idea upon seeing it.

[1]: https://www.youtube.com/watch?v=COCmaZA3d08

Event Timeline

thakis created this revision. Aug 8 2017, 12:28 PM
thakis updated this revision to Diff 111043. Aug 14 2017, 11:55 AM
thakis updated this revision to Diff 111044. Aug 14 2017, 11:58 AM
thakis edited the summary of this revision.
thakis edited the summary of this revision. Aug 14 2017, 12:01 PM
thakis edited the summary of this revision. Aug 14 2017, 12:05 PM
thakis added a reviewer: ruiu.
thakis added subscribers: llvm-commits, pcc, rnk.
ruiu edited edge metadata. Aug 14 2017, 12:54 PM

This is interesting. If we could make the output of this feature usable, and if the feature is useful for applications other than Chromium, it might be worth adding to the linker. I'd create a new subdirectory, analysis, under COFF and put this code there.

pcc added a comment. Aug 14 2017, 1:10 PM

I'm wondering whether .ilk size is mostly a function of the number of sections rather than the size of each section. If so, it may be simpler to create the redundancy report by counting the number of "comdat discarded" messages.
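As a rough illustration of that idea, the Python sketch below counts discard messages per symbol from linker log lines. The function name `count_discards` and the `"Discarded "` line prefix are assumptions made for the example; the actual wording of the linker's messages would need to be checked before using anything like this. Note that it counts discarded copies rather than bytes, so a function present in N object files would appear N-1 times.

```python
from collections import Counter


def count_discards(log_lines, prefix="Discarded "):
    """Count how often each symbol appears in a discard message.

    `prefix` is a hypothetical message prefix, not a documented format.
    Returns (symbol, count) pairs, most frequently discarded first.
    """
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if line.startswith(prefix):
            # Everything after the prefix is treated as the symbol name.
            counts[line[len(prefix):]] += 1
    return counts.most_common()
```

A report built this way ranks functions by number of sections rather than by total size, which is the quantity the comment above suspects matters for .ilk size.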

Neat!

I've done this without linker support (scraping objdump -s output, glomming it together with Python, etc.) to assess the impact of modular codegen, which can remove some of this redundancy. I'd be happy to give this version a whirl.

Does this measure redundancy in objects that don't end up linked (e.g., a library dependency where some of the objects go unused)? Maybe not a big deal: their distribution of redundancy would perhaps be about the same as in the rest of the population, so it wouldn't skew the results much...

thakis edited the summary of this revision. May 22 2019, 5:42 PM