Depends on D29695
Details
Diff Detail
- Build Status
Buildable 3767 Build 3767: arc lint + arc unit
Event Timeline
Note: I patched in https://github.com/pcc/llvm-project/commit/5a5904d6721f895eafdd2fc476872b98806c36e6 to measure perf impact. Total wall time spent in addRegularLTO when linking chrome was 8.2743s, as compared to about 6 seconds before (see D27324).
I assume these get dropped in the usual place where we eliminate available externally. Does VCP benefit from any of the other optimizations we would typically apply before that point (e.g. inlining)?
llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp | ||
---|---|---|
282 | Add comment about what these are testing (I think I have correctly reasoned them out, but would be good to be explicit). | |
297 | There are now comments about the eligible virtual funcs being added to the regular LTO, but I see there are no comments about the original condition here for putting a GVar in the regular LTO module - maybe add that or point to someplace that describes what we're doing. | |
llvm/test/Transforms/ThinLTOBitcodeWriter/split-vfunc-internal.ll | ||
10 | Where do the long hex strings come from in the names in this test? |
llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp | ||
---|---|---|
319 | (One-line comment) |
Just did a fresh set of runs on chrome with https://github.com/pcc/llvm-project/commits/lto-timers patched in, and took median of 5 for each measurement:
before after addRegularLTO 8.0407s 9.7432s (+21%) runRegularLTO 9.6064s 10.4748s (+9%)
Note that we haven't done everything we can -- a couple more things I can think of are to exclude functions that take non-integer arguments (other than "this"), and to set optnone on functions in the merged module. Taking a look at chrome's merged module, there are 6727 available_externally functions, of which 1340 take a pointer argument other than "this".
Yes (i.e. at the end of the regular LTO pipeline).
Does VCP benefit from any of the other optimizations we would typically apply before that point (e.g. inlining)?
I suppose that is possible in principle, but I wouldn't typically expect it to; I've found that VCP eligible functions tend to be relatively trivial functions that are fully inlined into leaf functions at compile time.
llvm/test/Transforms/ThinLTOBitcodeWriter/split-vfunc-internal.ll | ||
---|---|---|
10 | They are module IDs. See getModuleId in ThinLTOBitcodeWriter. |
Another data point for chrome is that this change causes us to merge 53487 struct type definitions, most of which are likely unnecessary (I think they are either associated with "this", which we make sure is unused, or one of the other pointer arguments, which means we don't need the function). So something else we could do that would be a little more tricky is to try to drop the type information for "this". (Of course, this would happen for free if we had typeless pointers.)
Quite interesting that we take more time to build and merge the IR than transforming and emitting code :)
llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp | ||
---|---|---|
278–279 | great comments! |
great comments!