This is an archive of the discontinued LLVM Phabricator instance.

[COFF] Reduce the size of Chunk and SectionChunk, NFC
ClosedPublic

Authored by rnk on Mar 25 2019, 1:20 PM.

Details

Summary

Reorder the fields in both to use padding more efficiently, and add more
comments on the purpose of the fields.

Replace std::vector<SectionChunk*> AssociativeChildren with a
singly-linked list. This avoids the separate vector allocation to list
associative children, and shrinks the 3 pointers used for the typically
empty vector down to 1.

In the end, this reduces the sum of heap allocations used to link
browser_tests.exe with NO PDB by 13.10%, going from 2,248,728 KB to
1,954,071 KB of heap. These numbers exclude memory mapped files, which
are of course a significant factor in LLD's memory usage.

Event Timeline

rnk created this revision.Mar 25 2019, 1:20 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 25 2019, 1:20 PM
Herald added a subscriber: jdoerfert. · View Herald Transcript
rnk updated this revision to Diff 192193.Mar 25 2019, 1:28 PM
  • Fix -Wreorder
ruiu added inline comments.Mar 25 2019, 1:31 PM
lld/COFF/Chunks.cpp
47

This might be useful but at the same time it looks a bit overly cautious? Perhaps the symbol size is more important but we don't have something like this for them, for example.

rnk marked an inline comment as done.Mar 25 2019, 1:41 PM
rnk added inline comments.
lld/COFF/Chunks.cpp
47

Well, we don't have checks for symbol size yet. :)

Given that we don't have performance monitoring, I really want people to think hard before they casually add another field to SectionChunk. I wouldn't insist on it if we did, but these types of static_asserts have proven useful in LLVM for preventing size creep.

ruiu added inline comments.Mar 25 2019, 1:47 PM
lld/COFF/Chunks.cpp
47

Then maybe reiterating everything again, how about checking directly with a number like static_assert(sizeof(Chunk) == 48)? This should suffice to prevent making the struct larger by accident.

rnk marked an inline comment as done.Mar 25 2019, 1:52 PM
rnk added inline comments.
lld/COFF/Chunks.cpp
47

It would be wrong for 32-bit. Rather than trying to express it as math on sizeof(void*), I think the struct makes it clearer. And it helps document hidden members like vptr.

@rnk wrote:
This reduces the total size of all LLD heap allocations by 5.22%

What is the total heap size before/after? What percentage of that heap size is used by Chunks? How many of them do you have?

lld/COFF/Chunks.cpp
79

Is std::vector<SectionChunk *>'s implementation guaranteed to be 3 pointers? Wouldn't something like AlignedCharArrayUnion<std::vector<SectionChunk *>> be better? Just wondering.

lld/COFF/Chunks.h
127

64-bit for a value that's probably <10 most of the time, seems like an awful waste of space ;-)

If this takes so much memory, it'd be interesting to investigate (in terms of performance and dynamic value range) if some of the pointer members below could be converted to offset / bitfields.

ruiu added inline comments.Mar 25 2019, 1:59 PM
lld/COFF/Chunks.cpp
47

Yeah, but it still feels weird to me to repeat all the members again in a different file just for the purpose of assertion. For 32-bit, we can just set the upper bound like static_assert(sizeof(Chunk) <= 48) where 48 is the size of the struct in 64-bit.

rnk added a comment.Mar 25 2019, 2:53 PM

@rnk wrote:
This reduces the total size of all LLD heap allocations by 5.22%

What is the total heap size before/after? What percentage of that heap size is used by Chunks? How many of them do you have?

I used the VS memory profiling tool and used the type view to see the most commonly allocated types. "void" just means allocations whose type wasn't determined, and a lot of it comes from vectors, specifically the new RelocTargets vector @mstorsjo added.

Before:
heap usage: 176,077 KB
type                                    count           size
lld-link.exe!lld::coff::SectionChunk    370,809         71,195,328
void                                    251,990         63,254,440
lld-link.exe!lld::coff::Symbol[]        444,020         24,865,120
lld-link.exe!lld::coff::DefinedRegular  379,729         18,226,992
char[]  249     1,233,585

After:
heap usage: 167,335 KB
type                                   |count          |byte size
---------------------------------------|---------------|-------------
void                                   |251,990        |63,254,440
lld-link.exe!lld::coff::SectionChunk   |370,809        |62,295,912
lld-link.exe!lld::coff::Symbol[]       |444,020        |24,865,120
lld-link.exe!lld::coff::DefinedRegular |379,729        |18,226,992
char[]                                 |249            |1,233,585

5.22% heap usage reduction
In D59797#1442365, @rnk wrote:

I used the VS memory profiling tool and used the type view to see the most commonly allocated types. "void" just means allocations whose type wasn't determined, and a lot of it comes from vectors, specifically the new RelocTargets vector @mstorsjo added.

Since that vector isn't needed at all for x86, I guess it should be possible to skip it altogether there. It'll complicate the code a little bit, and the conditional might cost a little bit of runtime, but that'd give you back a lot as well.

riccibruno added inline comments.
lld/COFF/Chunks.h
236

I think that llvm::SmallVector<Symbol *, 0> is one pointer smaller, with the drawback that it is limited to 2^32-1 elements.

rnk updated this revision to Diff 192223.Mar 25 2019, 4:32 PM
  • replace struct with current size
Herald added a project: Restricted Project. · View Herald TranscriptMar 25 2019, 4:32 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
rnk marked 4 inline comments as done.Mar 25 2019, 4:39 PM
rnk added inline comments.
lld/COFF/Chunks.cpp
47

Fair enough. Honestly, writing down all the fields in one place helped me identify the profitable reorderings, but we don't need to commit it.

lld/COFF/Chunks.h
232

I'm not sure we need to store this, actually, it's pretty easy to recompute on demand. At the very least, we could shorten the size to 32-bits. I think ~4 billion is a reasonable implementation limit on relocations.

236

My plan is to see if I can eliminate it completely, but I wanted to treat it separately once I established that the size of this was important and put a cap on it.

259

This, for example, is typically 2-3 elements long: each comdat .text section will have a an assoc pdata and xdata. A singly linked list could suffice instead.

pcc added a subscriber: pcc.Mar 25 2019, 4:45 PM
pcc added inline comments.
lld/COFF/Chunks.cpp
47

I found D59044 (and, to a lesser extent, D59269) by staring at the output of clang with -Xclang -fdump-record-layouts. Maybe it would be worth adding this trick to the documentation somewhere?

smeenai added inline comments.
lld/COFF/Chunks.cpp
47

The clang static analyzer has a padding checker (https://clang.llvm.org/docs/analyzer/checkers.html#optin-performance-padding), and clang-tools-extra has clang-reorder-fields to automatically reorder fields according to the padding checker's findings. I haven't used those myself, but a coworker had success with them in the past.

rnk updated this revision to Diff 192752.Mar 28 2019, 5:43 PM
  • Replace std::vector<> with singly linked list
rnk added a comment.Mar 28 2019, 5:48 PM

I added one last size optimization, replacing a std::vector with a forward linked list and a custom iterator for it. PTAL

rnk added a comment.Apr 2 2019, 11:40 AM

I think I've addressed all the comments, and I know that @ruiu is currently travelling or recovering from travel to Japan, so I'll go ahead and land this. Feel free to provide post-commit feedback.

rnk retitled this revision from [COFF] Reorder fields in Chunk and SectionChunk to reduce their size to [COFF] Reduce the size of Chunk and SectionChunk, NFC.Apr 2 2019, 3:08 PM
rnk edited the summary of this revision. (Show Details)
rnk updated this revision to Diff 193371.Apr 2 2019, 3:10 PM
  • last update
This revision was not accepted when it landed; it landed in state Needs Review.Apr 2 2019, 3:12 PM
This revision was automatically updated to reflect the committed changes.
MaskRay added a subscriber: MaskRay.Apr 2 2019, 6:17 PM

this reduces the sum of heap allocations ... These numbers exclude memory mapped files

May I ask how you counted the memory usage minus memory mapped files?

rnk added a comment.Apr 3 2019, 12:52 PM

this reduces the sum of heap allocations ... These numbers exclude memory mapped files

May I ask how you counted the memory usage minus memory mapped files?

I'm using the Visual Studio memory profiler tool, which I believe just tracks heap allocations. It doesn't look at virtual memory or file mappings at all, it just sums the total size of all active heap allocations. So, it's not a complete picture of memory usage, but it represents the memory that we have the most control over.