This is an archive of the discontinued LLVM Phabricator instance.

ELF2: Implement --gc-sections.
ClosedPublic

Authored by ruiu on Oct 21 2015, 10:30 AM.

Details

Reviewers
rafael
Summary

Section garbage collection is a feature to remove unused sections
from outputs. Unused sections are sections that cannot be reachable
from known GC-root symbols or sections. Naturally the feature is
implemented as a mark-sweep garbage collector.

In this patch, I added Live bit to InputSectionBase. If and only
if Live bit is on, the section will be written to the output.
Starting from GC-root symbols or sections, a new function, markLive(),
visits all reachable sections and sets their Live bits. Writer then
ignores sections whose Live bit is off, so that such sections are
excluded from the output.

This change has small negative impact on performance if you use
the feature. The time to link Clang changes from 0.356s to 0.386s,
or +8%.

It reduces Clang size from 57,764,984 bytes to 55,296,600 bytes.
That is 4.3% reduction.

Diff Detail

Event Timeline

ruiu updated this revision to Diff 38027.Oct 21 2015, 10:30 AM
ruiu retitled this revision from to ELF2: Implement --gc-sections. .
ruiu updated this object.
ruiu added a reviewer: rafael.
ruiu updated this object.
ruiu added a subscriber: llvm-commits.
ruiu updated this revision to Diff 38029.Oct 21 2015, 10:36 AM

Remove debug code.

rafael edited edge metadata.Oct 21 2015, 2:33 PM
rafael added a subscriber: rafael.

Just applied the patch locally to review it. There are a few
formatting issues. Please git-clang-format.

I am also getting a test failure:

Command Output (stderr):

<stdin>:178:2: error: GC-NOT: string occurred!
Name: x (3)
^
/home/espindola/llvm/lld/test/elf2/gc-sections.s:24:11: note: GC-NOT:
pattern specified here

  1. GC-NOT: Name: x
ruiu updated this revision to Diff 38052.Oct 21 2015, 2:43 PM
ruiu edited edge metadata.
  • Fix formatting
  • Fix test
rafael added inline comments.Oct 21 2015, 3:12 PM
ELF/MarkLive.cpp
68

Can you check anything other than the name? The section type or flags maybe?

96

]You only need to do this if you are producing a shared library or --export-dynamic is used.

ELF/Writer.cpp
425

Can you check for "this == &InputSection<ELFT>::Discarded" in isLive?

ruiu added inline comments.Oct 21 2015, 3:18 PM
ELF/MarkLive.cpp
68

This list may be limited but I don't know what other sections need to be preserved.

gold seems to preserve more sections (https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=gold/object.cc;h=76d4630d16a8cfbe445f421250798c3438aa48e4;hb=HEAD#l353). I didn't copy that as I don't know the rational for some sections listed there.

96

Externally visible symbols might be used later at runtime to interpose symbols provided by DSOs, no?

ruiu added inline comments.Oct 21 2015, 3:30 PM
ELF/Writer.cpp
425

We can, but I'm not sure if that's the right thing to do. Section merging and garbage collection are basically orthogonal, so hiding one thing behind the other's predicate may be confusing.

ruiu added a comment.Oct 21 2015, 3:34 PM

By the way, do you think there's a better way to visit all successors than the way I did in MarkLive? I think the code is not bad (forEachSuccessor and doForEachSuccessor are about 40 lines in total), but I feel that the code is a bit awkward. Is there any way to iterate over both REL and RELA relocations without using templates? Is there a better way to visit both non-local and local symbols?

joerg added a subscriber: joerg.Oct 21 2015, 3:34 PM
joerg added inline comments.
ELF/MarkLive.cpp
68

Note sections are rather obvious, you shouldn't be getting a working binary on many platforms otherwise. Exceptions and JCR are similar. Preinit is a variant of the existing init theme.

I'm not sure why personality functions and nptl_version are special cased. The latter sounds like a glibc specific hack and personality functions should normally be referenced from the exception table. ARM issue perhaps?

Can you check anything other than the name? The section type or flags maybe?

This list may be limited but I don't know what other sections need to be preserved.

I don't mean to preserve less sections. Just how we check them. Of your list:

  • .ctors, .dtors: I think we have to use the name.
  • .fini: Check SHT_INIT_ARRAY
  • .init: Check SHT_FINI_ARRAY
  • .note: Check SHT_NOTE

You also need a check for SHT_PREINIT_ARRAY.

Externally visible symbols might be used later at runtime to interpose symbols provided by DSOs, no?

You can't interpose if it is already in the main binary, and if a
shared library wants symbols from the binary, the binary must be
linked with --export-dynamic.

Cheers,
Rafael

ruiu updated this revision to Diff 38062.Oct 21 2015, 4:16 PM

USE sh_type instead of section name.

rafael accepted this revision.Oct 21 2015, 7:29 PM
rafael edited edge metadata.

LGTM with the "Preserve externally-visible symbols." part changed to only do it for -shared and -export-dynamic.

This revision is now accepted and ready to land.Oct 21 2015, 7:29 PM
Eugene.Zelenko added a subscriber: Eugene.Zelenko.

Committed in r251043.