When a note is in a group and the other elements of the group are discarded, also discard the note. This is needed both to respect the group semantic and for the annobin tool (see https://developers.redhat.com/blog/2018/02/20/annobin-storing-information-binaries/).
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I stll need to do some extra validation, but the patch is in a decent state and I'll happily take feedbakc. @psmith I followed the discussion we had at FOSDEM back in Februrary and limited the scope of the patch to SHT_NOTE.
Some additional conetxt. IIRC we originally discussed this as part of D56437 . The specific part of annobin was the feature:
"attach""no-attach" When gcc compiles code with the -ffunction-sections option active it will place each function into its own section. When the annobin attach option is active the plugin will attempt to attach the function section to a group containing the notes and relocations for the function. In that way, if the linker decides to discard the function, it will also know that it should discard the notes and relocations as well. The default is to enable attach, but the inverse option is available in case the host assembler does not support the .attach_to_group pseudo-op. If this feature is disabled then note generation for function sections will not work properly.
In D56437 there were some objections (https://reviews.llvm.org/D56437#1359385) to treating members of a COMDAT group in a keep-all/discard-all way. The non-gc impacting part of D56437 was deferred to later. I guess this is it.
I don't have any objections. Adding some more reviewer that were involved in D56437.
lld/ELF/InputFiles.cpp | ||
---|---|---|
552 | At this point we don't know whether the group has been accepted or not. Would it be possible to move this into the braces of the following (line 568) if (isNew) { if (config->relocatable) this->sections[i] = createInputSection(sec); continue; } Possibly renaming it to selectedGroups? Given that a large number of COMDAT groups will be discarded in a large C++ program it is probably worth only scanning selected ones for SHT_NOTES sections. | |
618 | I'd word the comment to mention why we are deviating from the specification here. Perhaps something like: | |
632 | I think you can skip this loop if notes is empty. The majority of groups won't have any SHT_NOTES sections. |
Better support for group semantic wrt. notes
This is needed both to respect the group semantic and for the annobin tool (see
"to respect the group semantic" is not accurate. The ELF specification does not have such rules. I think this is more of a GNU ld --gc-sections semantic (not even implemented in gold). It seems that when GNU ld decides to discard a SHF_ALLOC section in a comdat group, it discards non-SHF_ALLOC sections in the group as well. The section type does not seem to matter. If we rename .note..text.foo (is double-dot .. the intention?) to .aaa, and change its type from SHT_NOTE to SHT_PROGBITS, we will get the same observations. Knowing the rules used by GNU ld will be helpful.
lld/test/ELF/sht-group-note.test | ||
---|---|---|
1 | A file-level comment that describes the purpose. Perhaps rename the test to gc-sections-group.s | |
34 | Add Content: "00" GNU ld discards empty sections. A non-empty .text.foo makes it easy to compare the linkers' behaviors. | |
37 | .rela.text.foo is unrelated and can be deleted. |
lld/ELF/InputFiles.cpp | ||
---|---|---|
509 | Maybe we should just use std::vector. When an object file contains section groups, there can be many. 4 is probably not better than 0. The code size increase is probably unnecessary. | |
638 | I think note->dependentSections only carries one-bit information: whether it is in a group or not. How many sections and what sections it depends on does not matter. |
"to respect the group semantic" is not accurate. The ELF specification does not have such rules. I think this is more of a GNU ld --gc-sections semantic (not even implemented in gold). It seems that when GNU ld decides to discard a SHF_ALLOC section in a comdat group, it discards non-SHF_ALLOC sections in the group as well. The section type does not seem to matter. If we rename .note..text.foo (is double-dot .. the intention?) to .aaa, and change its type from SHT_NOTE to SHT_PROGBITS, we will get the same observations. Knowing the rules used by GNU ld will be helpful.
I was basing my comment on https://docs.oracle.com/cd/E19120-01/open.solaris/819-0690/chapter7-26/index.html
There can be internal references among group sections. However, these references make no sense if one of the sections were removed, or one of the sections were replaced by a duplicate from another object. Therefore, these groups are included, or these groups are omitted, from the linked object as a unit.
I think this approach may have a problem when we consider /DISCARD/ in a linker script. I'll propose an alternative fix which I believe will improve the case and improve the compatibility with GNU ld, in the hope that you won't mind :)
lld/ELF/MarkLive.cpp | ||
---|---|---|
362 | This is not specific to SHT_NOTE. Delete sec->type == SHT_NOTE && | |
lld/test/ELF/sht-group-note.test | ||
1 | I still think a test named gc-sections-*.s will be more appropriate. | |
52 | (Aside: is double-dot .. in .note..text.foo intentional?) Add at least 3 members to the group: SHT_NOTE (non-SHF_ALLOC), SHT_PROGBITS, SHT_NOBITS. Attach a symbol to each section. Add a test case for each symbol, that checks -e sym can retain all 3 sections. |
I think this approach may have a problem when we consider /DISCARD/ in a linker script. I'll propose an alternative fix which I believe will improve the case and improve the compatibility with GNU ld, in the hope that you won't mind :)
The following patch is a frail tentative at supporting /DISCARD/ in linker script:
diff --git a/lld/ELF/LinkerScript.cpp b/lld/ELF/LinkerScript.cpp index cebbd89..1d5c8d7 100644 --- a/lld/ELF/LinkerScript.cpp +++ b/lld/ELF/LinkerScript.cpp @@ -442,6 +442,9 @@ LinkerScript::computeInputSections(const InputSectionDescription *cmd) { } void LinkerScript::discard(InputSectionBase *s) { + if(!s->isLive()) + return; + if (s == in.shStrTab || s == mainPart->relaDyn || s == mainPart->relrDyn) error("discarding " + s->name + " section is not allowed"); diff --git a/lld/test/ELF/linkerscript/discard-section-group.test b/lld/test/ELF/linkerscript/discard-section-group.test new file mode 100644 index 0000000..e5b090f --- /dev/null +++ b/lld/test/ELF/linkerscript/discard-section-group.test @@ -0,0 +1,47 @@ +# REQUIRES: x86 +# RUN: yaml2obj %s -o %t.o +# +# RUN: echo "SECTIONS { /DISCARD/ : { *(.text.foo*) } }" > %t0.script +# RUN: ld.lld -o %t0 --script %t0.script %t.o +# RUN: llvm-readobj -S %t0 | FileCheck %s +# +# RUN: echo "SECTIONS { /DISCARD/ : { *(.text.foo*) } /DISCARD/ : { *(.note*) } }" > %t1.script +# RUN: ld.lld -o %t1 --script %t1.script %t.o +# RUN: llvm-readobj -S %t1 | FileCheck %s +# +# RUN: echo "SECTIONS { /DISCARD/ : { *(.note*) } }" > %t2.script +# RUN: ld.lld -o %t2 --script %t2.script %t.o +# RUN: llvm-readobj -S %t2 | FileCheck %s + +# CHECK-NOT: .text +# CHECK-NOT: .note + +--- !ELF +FileHeader: + Class: ELFCLASS64 + Data: ELFDATA2LSB + Type: ET_REL + Machine: EM_X86_64 +Sections: + - Name: .group + Type: SHT_GROUP + Link: .symtab + Info: foo + Members: + - SectionOrType: GRP_COMDAT + - SectionOrType: .text.foo + - SectionOrType: .note.text.foo + - Name: .note.text.foo + Type: SHT_NOTE + Flags: [ SHF_GROUP ] + Content: "DEAD" + - Name: .text.foo + Type: SHT_PROGBITS + Flags: [ SHF_ALLOC, SHF_EXECINSTR, SHF_GROUP ] + Content: "00" +Symbols: + - Name: foo + Binding: STB_GLOBAL + Type: STT_FUNC + Section: .text.foo + Size: 0x08
So at first glance it looks possible to apply a patch on top of this one to support /DISCARD/, and I'd be happier if we can do it that way rather than discard this review.
That being said, if you're alternative fix improves the case, let's put ego aside and go that way :-)
See D70146:)
The pasted patch changes the test linkerscript/discard-group.s from an infinite recursion to another unexpected behavior (dropping .text for /DISCARD/ : { *(.note*) }). I think we need a new field to express the relationship. In GNU ld, bfd_elf_section_data::next_in_group tracks such information and is used by its garbage collector.
Thanks for the alternative patch! Indeed using a new field avoids interacting with dependentSections and makes it possible to cleanly match ld's behavior.
I'm surprised by ld's behavior wrt. discarding a group member. Based on the semantic of a group, I would have assumed that it either raises an error, or discards the whole group, or raise a warning and keep the section if the whole group is not discarded. Allowing to remove a section from the group seems strange to me. Quoting https://docs.oracle.com/cd/E23824_01/html/819-0690/chapter7-26.html
However, these references make no sense if one of the sections were removed
I'll try to find more info on that aspect.
Maybe we should just use std::vector. When an object file contains section groups, there can be many. 4 is probably not better than 0. The code size increase is probably unnecessary.