This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Improve --gc-sections compatibility with GNU ld regarding section groups
ClosedPublic

Authored by MaskRay on Nov 12 2019, 2:34 PM.

Details

Summary

Based on D70020 by serge-sans-paille.

The ELF spec says:

Furthermore, there may be internal references among these sections that would not make sense if one of the sections were removed or replaced by a duplicate from another object. Therefore, such groups must be included or omitted from the linked object as a unit. A section cannot be a member of more than one group.

GNU ld has 2 behaviors that we don't have:

  • Group members (nextInSectionGroup != nullptr) are subject to garbage collection. This includes non-SHF_ALLOC SHT_NOTE sections. In particular, discarding non-SHF_ALLOC SHT_NOTE sections is an expected behavior by the Annobin project. See https://developers.redhat.com/blog/2018/02/20/annobin-storing-information-binaries/ for more information.
  • Groups members are retained or discarded as a unit. Members may have internal references that are not expressed as SHF_LINK_ORDER, relocations, etc. It seems that we should be more conservative here: if a section is marked live, mark all the other member within the group.

Both behaviors are reasonable. This patch implements them.

A new field InputSectionBase::nextInSectionGroup tracks the next member
within a group. on ELF64, this increases sizeof(InputSectionBase) froms
144 to 152.

InputSectionBase::dependentSections tracks section dependencies, which
is used by both --gc-sections and /DISCARD/. We can't overload it for
the "next member" semantic, because we should allow /DISCARD/ to discard
sections independent of --gc-sections (GNU ld behavior). This behavior
may be reasonably used by /DISCARD/ : { *(.ARM.exidx*) } or `/DISCARD/
: { *(.note*) } (new test linkerscript/discard-group.s`).

Diff Detail

Event Timeline

MaskRay created this revision.Nov 12 2019, 2:34 PM
MaskRay updated this revision to Diff 228969.Nov 12 2019, 2:46 PM
MaskRay edited the summary of this revision. (Show Details)

Improve the description and a comment.

LGTM. The only thing that has me worried is to have two fields (dependentSections and nextInGroup) to represent concepts that are close (but different, due to that strange behavior wrt. /DISCARD/ sections).

I'm broadly in favour of making this change. I recognise there are some concerns about the intent behind the ELF specification, which is largely written without garbage collection in mind. One reading of it is that garbage collection is orthogonal to group selection, and the all or nothing comments relate to group selection only. Will be good to wait a few days to see if there are any strong objections. I tend towards the conservative approach unless there are some demonstrable benefits from not doing so.

For upstream clang compiled objects I don't think that there would be a large impact for this change as the majority of groups have just one Section and a Relocation Section. ARM compiled groups also have .ARM.exidx and accompanying relocations but these are already all or nothing due to the way they are constructed, nothing refers to the .ARM.exidx and we rely on the SHF_LINK_ORDER reverse dependency for garbage collection protection.

I'm happy to separate out nextInGroup and dependentSections as while there are some similarities the former is used for ELF groups the latter is used for SHF_LINK_ORDER and keeping the concepts separate is reasonable.

Cool. I've started this thread https://sourceware.org/ml/binutils/2019-11/msg00187.html to gather more opinions about the reading of the specs.

FTR, I also think I have no objections. A few minor nits are inline.

lld/ELF/InputFiles.cpp
621

nit: I so not think that we declare more than one variable on a line in the LLD code.
Personally I find it OK for my pet projects, but this probably violates the LLD style, so ...

623

use continue?

lld/test/ELF/linkerscript/discard-group.s
2

Why arm? Can this be x86?

MaskRay updated this revision to Diff 229346.Nov 14 2019, 10:24 AM
MaskRay marked 3 inline comments as done.

review comments

MaskRay marked 3 inline comments as done.Nov 14 2019, 10:24 AM
MaskRay added inline comments.
lld/test/ELF/linkerscript/discard-group.s
2

The intention is to create a practical use case which is more practical. It demonstrates the advantage that makes section group selection and garbage collection orthogonal for group members. https://sourceware.org/ml/binutils/2019-11/msg00210.html

.fnstart creates an associated section (.ARM.exidx*) in the section group, while .cfi_startproc creates .eh_frame outside the group.

grimar added inline comments.Nov 18 2019, 1:31 AM
lld/test/ELF/linkerscript/discard-group.s
2

Ok, I see now, thanks.

.fnstart creates an associated section (.ARM.exidx*) in the section group

Perhaps I'd mention this in the comments in this test case, for readers who are not so familar with arm (like me).

MaskRay updated this revision to Diff 229972.Nov 18 2019, 10:00 PM

Add more comments to test/ELF/linkerscript/discard-group.s

MaskRay marked an inline comment as done.Nov 18 2019, 10:00 PM
ruiu accepted this revision.Nov 18 2019, 11:07 PM

LGTM

lld/ELF/InputFiles.cpp
595

This is not your code, but could you please add a comment for this block since you add more code to this function? This block handles SHF_LINK_ORDER section flags.

lld/ELF/InputSection.h
147

I'd probably name this sectionGroupNextMember or something like that, so that it is clear that "group" means "section group".

This revision is now accepted and ready to land.Nov 18 2019, 11:07 PM
MaskRay marked an inline comment as done.Nov 18 2019, 11:32 PM
MaskRay added inline comments.
lld/ELF/InputSection.h
147

What about nextInSectionGroup?

@peter.smith What do you think?

(I confess that in BFD, a member serving a similar purpose is named next_in_group.)

MaskRay updated this revision to Diff 229981.Nov 18 2019, 11:32 PM

Add a comment

ruiu added inline comments.Nov 19 2019, 12:17 AM
lld/ELF/InputSection.h
147

That's fine too. I honestly don't think next_in_group is a good name, as we use "group" for other meanings such as --start-group.

MaskRay updated this revision to Diff 229987.Nov 19 2019, 12:29 AM
MaskRay edited the summary of this revision. (Show Details)

nextInGroup -> nextInSectionGroup

This looks good to me too.

lld/ELF/InputSection.h
147

Yes nextInSectionGroup is fine.

This revision was automatically updated to reflect the committed changes.