This is an archive of the discontinued LLVM Phabricator instance.

[opt] Devirtualize the SymbolBody type hierarchy and start compacting its members into the base class.
ClosedPublic

Authored by chandlerc on Jun 27 2015, 4:04 AM.

Details

Summary

First, to help motivate this kind of change, understand that in
a self-link, LLD creates 5.5 million defined regular symbol bodies (and
6 million symbol bodies total). A significant portion of its time is
spent allocating the memory for these symbols, and befor ethis patch
the defined regular symbol body objects alone consumed some 420mb of
memory during the self link.

As a consequence, I think it is worth expending considerable effort to
make these objects as memory efficient as possible. This is the first of
several components of that. This change starts with the goal of removing
the virtual functins from SymbolBody so that it can avoid having a vptr
embedded in it when it already contains a "kind" member, and that member
can be much more compact than a vptr.

The primary way of doing this is to sink as much of the logic that we
would have to dispatch for into data in the base class. As part of this,
I made the various flags bits that will pack into a bitfield with the
kind tag. I also sank the Name down to eliminate the dispatch for that,
and used LLVM's RTTI-style dispatch for everything else (most of which
is cold and so doesn't matter terribly if we get minutely worse lowering
than a vtable dispatch).

As I was doing this, I wanted to make the RTTI-dispatch (which would
become much hotter than before) as efficient as possible, so I've
re-organized the tags somewhat. Notably, the common case (regular
defined symbols) is now zero which we can test for faster.

I also needed to rewrite the comparison routine used during resolving
symbols. This proved to be quite complex as the semantics of the
existing one were very subtle due to the back-and-forth virtual dispatch
caused by re-dispatching with reversed operands. I've consolidated it to
a single function and tried to comment it quite a bit more to help
explain what is going on. However, this may need more comments or other
explanations. It at least passes all the regression tests. I'm not
working on Windows, so I can't fully test it.

With all of these changes, the size of a DefinedRegular symbol on
a 64-bit build goes from 80 bytes to 64 bytes, and we save approximately
84mb or 20% of the memory consumed by these symbol bodies during the
link.

The link time appears marginally faster as well, and the profile hotness
of the memory allocation subsystem got a bit better, but there is still
a lot of allocation traffic.

Diff Detail

Repository
rL LLVM

Event Timeline

chandlerc updated this revision to Diff 28628.Jun 27 2015, 4:04 AM
chandlerc retitled this revision from to [opt] Devirtualize the SymbolBody type hierarchy and start compacting its members into the base class..
chandlerc updated this object.
chandlerc edited the test plan for this revision. (Show Details)
chandlerc added a reviewer: ruiu.
chandlerc added a subscriber: Unknown Object (MLST).
ruiu added inline comments.Jun 27 2015, 12:40 PM
COFF/Symbols.cpp
46–48 ↗(On Diff #28628)

Move this before "if (LK != RK)" to reduce nesting level a bit.

145–146 ↗(On Diff #28628)

getRVA should never be called for bitcode symbols. You can remove this "case" and DefinedBitcode::getRVA.

168–169 ↗(On Diff #28628)

Ditto

170–171 ↗(On Diff #28628)

This can also be removed.

COFF/Symbols.h
97–104 ↗(On Diff #28628)

Does the use of bitfield here save space? I guess StringRef is 8 byte aligned on 64 bit platform, so we have 8 bytes for attributes. We have four fields, so we can use 1 byte for each.

98 ↗(On Diff #28628)

Can you use C++11 initializer?

unsigned IsExternal : 1 = true;
114 ↗(On Diff #28628)

Add explicit.

130 ↗(On Diff #28628)

All other classes starts with Defined, so name this DefinedObject, DefinedFile or DefinedCOFF.

131 ↗(On Diff #28628)

Why do you need this?

163 ↗(On Diff #28628)

Remove blank line.

188 ↗(On Diff #28628)

Do we need this?

chandlerc updated this revision to Diff 28710.Jun 29 2015, 2:14 PM

Updating to reflect changes from code review.

I think I've addressed everything. PTAL.

COFF/Symbols.cpp
46–48 ↗(On Diff #28628)

Done.

145–146 ↗(On Diff #28628)

Sure, I was letting the inliner do it, I'll do it directly and give a better message.

168–169 ↗(On Diff #28628)

Done.

170–171 ↗(On Diff #28628)

Similar to the above, done.

COFF/Symbols.h
97–104 ↗(On Diff #28628)

The very next flag added would push us over 4 bytes though, and that would waste space on a 32-bit system. I'd like to pick a design that makes it easy to extend without having dramatic changes in efficiency.

98 ↗(On Diff #28628)

I tried and it didn't compile. =/

114 ↗(On Diff #28628)

Done.

130 ↗(On Diff #28628)

Done.

131 ↗(On Diff #28628)

To implement lazy getName and getDebugName.

163 ↗(On Diff #28628)

Sure.

188 ↗(On Diff #28628)

Yes, it's used to implement compare.

ruiu accepted this revision.Jun 29 2015, 2:24 PM
ruiu edited edge metadata.

LGTM

This revision is now accepted and ready to land.Jun 29 2015, 2:24 PM
This revision was automatically updated to reflect the committed changes.