This is an archive of the discontinued LLVM Phabricator instance.

[lld] Better support for group semantic wrt. notes
AbandonedPublic

Authored by serge-sans-paille on Nov 8 2019, 10:29 AM.

Details

Summary

When a note is in a group and the other elements of the group are discarded, also discard the note. This is needed both to respect the group semantic and for the annobin tool (see https://developers.redhat.com/blog/2018/02/20/annobin-storing-information-binaries/).

Diff Detail

Event Timeline

I stll need to do some extra validation, but the patch is in a decent state and I'll happily take feedbakc. @psmith I followed the discussion we had at FOSDEM back in Februrary and limited the scope of the patch to SHT_NOTE.

peter.smith added a subscriber: peter.smith.

Some additional conetxt. IIRC we originally discussed this as part of D56437 . The specific part of annobin was the feature:

"attach""no-attach"
When gcc compiles code with the -ffunction-sections option active it will place each function into its own section.  When the annobin attach option is active the plugin will attempt to attach the function section to a group containing the notes and relocations for the function.  In that way, if the linker decides to discard the function, it will also know that it should discard the notes and relocations as well.

The default is to enable attach, but the inverse option is available in case the host assembler does not support the .attach_to_group pseudo-op.  If this feature is disabled then note generation for function sections will not work properly.

In D56437 there were some objections (https://reviews.llvm.org/D56437#1359385) to treating members of a COMDAT group in a keep-all/discard-all way. The non-gc impacting part of D56437 was deferred to later. I guess this is it.

I don't have any objections. Adding some more reviewer that were involved in D56437.

lld/ELF/InputFiles.cpp
552

At this point we don't know whether the group has been accepted or not. Would it be possible to move this into the braces of the following (line 568)

if (isNew) {
  if (config->relocatable)
    this->sections[i] = createInputSection(sec);
  continue;
}

Possibly renaming it to selectedGroups?

Given that a large number of COMDAT groups will be discarded in a large C++ program it is probably worth only scanning selected ones for SHT_NOTES sections.

618

I'd word the comment to mention why we are deviating from the specification here. Perhaps something like:
The ELF specification states "If the linker decides to remove the section group, it must remove all members of the group." The annobin tool places SHT_NOTES sections into groups, relying on the linker garbage collection to remove them if the group is removed. As LLD does not garbage collect non SHF_ALLOC sections we make the notes dependent on the non-SHT_NOTE group sections so that we can garbage collect them.

632

I think you can skip this loop if notes is empty. The majority of groups won't have any SHT_NOTES sections.

MaskRay added a comment.EditedNov 11 2019, 5:34 PM

Better support for group semantic wrt. notes

This is needed both to respect the group semantic and for the annobin tool (see

"to respect the group semantic" is not accurate. The ELF specification does not have such rules. I think this is more of a GNU ld --gc-sections semantic (not even implemented in gold). It seems that when GNU ld decides to discard a SHF_ALLOC section in a comdat group, it discards non-SHF_ALLOC sections in the group as well. The section type does not seem to matter. If we rename .note..text.foo (is double-dot .. the intention?) to .aaa, and change its type from SHT_NOTE to SHT_PROGBITS, we will get the same observations. Knowing the rules used by GNU ld will be helpful.

lld/test/ELF/sht-group-note.test
1 ↗(On Diff #228493)

A file-level comment that describes the purpose.

Perhaps rename the test to gc-sections-group.s

34 ↗(On Diff #228493)

Add Content: "00"

GNU ld discards empty sections. A non-empty .text.foo makes it easy to compare the linkers' behaviors.

37 ↗(On Diff #228493)

.rela.text.foo is unrelated and can be deleted.

MaskRay added inline comments.Nov 11 2019, 5:43 PM
lld/ELF/InputFiles.cpp
509

Maybe we should just use std::vector. When an object file contains section groups, there can be many. 4 is probably not better than 0. The code size increase is probably unnecessary.

638

I think note->dependentSections only carries one-bit information: whether it is in a group or not. How many sections and what sections it depends on does not matter.

"to respect the group semantic" is not accurate. The ELF specification does not have such rules. I think this is more of a GNU ld --gc-sections semantic (not even implemented in gold). It seems that when GNU ld decides to discard a SHF_ALLOC section in a comdat group, it discards non-SHF_ALLOC sections in the group as well. The section type does not seem to matter. If we rename .note..text.foo (is double-dot .. the intention?) to .aaa, and change its type from SHT_NOTE to SHT_PROGBITS, we will get the same observations. Knowing the rules used by GNU ld will be helpful.

I was basing my comment on https://docs.oracle.com/cd/E19120-01/open.solaris/819-0690/chapter7-26/index.html

There can be internal references among group sections. However, these references make no sense if one of the sections were removed, or one of the sections were replaced by a duplicate from another object. Therefore, these groups are included, or these groups are omitted, from the linked object as a unit.

serge-sans-paille marked 7 inline comments as done.

Thanks @psmith and @MaskRay for the comments, most comments have been taken into account. Most significantly, I updated the section dependencies for a cleaner and less memory-intensive version.

Thanks for the updates, I've no more comments and have no objections.

I think this approach may have a problem when we consider /DISCARD/ in a linker script. I'll propose an alternative fix which I believe will improve the case and improve the compatibility with GNU ld, in the hope that you won't mind :)

lld/ELF/MarkLive.cpp
362

This is not specific to SHT_NOTE. Delete sec->type == SHT_NOTE &&

lld/test/ELF/sht-group-note.test
51 ↗(On Diff #228835)

(Aside: is double-dot .. in .note..text.foo intentional?)

Add at least 3 members to the group: SHT_NOTE (non-SHF_ALLOC), SHT_PROGBITS, SHT_NOBITS. Attach a symbol to each section. Add a test case for each symbol, that checks -e sym can retain all 3 sections.

1 ↗(On Diff #228493)

I still think a test named gc-sections-*.s will be more appropriate.

serge-sans-paille marked 3 inline comments as done.

I think this approach may have a problem when we consider /DISCARD/ in a linker script. I'll propose an alternative fix which I believe will improve the case and improve the compatibility with GNU ld, in the hope that you won't mind :)

The following patch is a frail tentative at supporting /DISCARD/ in linker script:

diff --git a/lld/ELF/LinkerScript.cpp b/lld/ELF/LinkerScript.cpp
index cebbd89..1d5c8d7 100644
--- a/lld/ELF/LinkerScript.cpp
+++ b/lld/ELF/LinkerScript.cpp
@@ -442,6 +442,9 @@ LinkerScript::computeInputSections(const InputSectionDescription *cmd) {
 }
 
 void LinkerScript::discard(InputSectionBase *s) {
+  if(!s->isLive())
+    return;
+
   if (s == in.shStrTab || s == mainPart->relaDyn || s == mainPart->relrDyn)
     error("discarding " + s->name + " section is not allowed");
 
diff --git a/lld/test/ELF/linkerscript/discard-section-group.test b/lld/test/ELF/linkerscript/discard-section-group.test
new file mode 100644
index 0000000..e5b090f
--- /dev/null
+++ b/lld/test/ELF/linkerscript/discard-section-group.test
@@ -0,0 +1,47 @@
+# REQUIRES: x86
+# RUN: yaml2obj %s -o %t.o
+#
+# RUN: echo "SECTIONS { /DISCARD/ : { *(.text.foo*) } }" > %t0.script
+# RUN: ld.lld -o %t0 --script %t0.script %t.o
+# RUN: llvm-readobj -S %t0 | FileCheck %s
+#
+# RUN: echo "SECTIONS { /DISCARD/ : { *(.text.foo*) } /DISCARD/ : { *(.note*) } }" > %t1.script
+# RUN: ld.lld -o %t1 --script %t1.script %t.o
+# RUN: llvm-readobj -S %t1 | FileCheck %s
+#
+# RUN: echo "SECTIONS { /DISCARD/ : { *(.note*) } }" > %t2.script
+# RUN: ld.lld -o %t2 --script %t2.script %t.o
+# RUN: llvm-readobj -S %t2 | FileCheck %s
+
+# CHECK-NOT: .text
+# CHECK-NOT: .note
+
+--- !ELF
+FileHeader:
+  Class:           ELFCLASS64
+  Data:            ELFDATA2LSB
+  Type:            ET_REL
+  Machine:         EM_X86_64
+Sections:
+  - Name:            .group
+    Type:            SHT_GROUP
+    Link:            .symtab
+    Info:            foo
+    Members:
+      - SectionOrType:    GRP_COMDAT
+      - SectionOrType:    .text.foo
+      - SectionOrType:    .note.text.foo
+  - Name:            .note.text.foo
+    Type:            SHT_NOTE
+    Flags:           [ SHF_GROUP ]
+    Content:         "DEAD"
+  - Name:            .text.foo
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_EXECINSTR, SHF_GROUP ]
+    Content:         "00"
+Symbols:
+  - Name:            foo
+    Binding:         STB_GLOBAL
+    Type:            STT_FUNC
+    Section:         .text.foo
+    Size:            0x08

So at first glance it looks possible to apply a patch on top of this one to support /DISCARD/, and I'd be happier if we can do it that way rather than discard this review.

That being said, if you're alternative fix improves the case, let's put ego aside and go that way :-)

I think this approach may have a problem when we consider /DISCARD/ in a linker script. I'll propose an alternative fix which I believe will improve the case and improve the compatibility with GNU ld, in the hope that you won't mind :)

The following patch is a frail tentative at supporting /DISCARD/ in linker script:

diff --git a/lld/ELF/LinkerScript.cpp b/lld/ELF/LinkerScript.cpp
index cebbd89..1d5c8d7 100644
--- a/lld/ELF/LinkerScript.cpp
+++ b/lld/ELF/LinkerScript.cpp
@@ -442,6 +442,9 @@ LinkerScript::computeInputSections(const InputSectionDescription *cmd) {
 }
 
 void LinkerScript::discard(InputSectionBase *s) {
+  if(!s->isLive())
+    return;
+
   if (s == in.shStrTab || s == mainPart->relaDyn || s == mainPart->relrDyn)
     error("discarding " + s->name + " section is not allowed");
 
diff --git a/lld/test/ELF/linkerscript/discard-section-group.test b/lld/test/ELF/linkerscript/discard-section-group.test
new file mode 100644
index 0000000..e5b090f
--- /dev/null
+++ b/lld/test/ELF/linkerscript/discard-section-group.test
@@ -0,0 +1,47 @@
+# REQUIRES: x86
+# RUN: yaml2obj %s -o %t.o
+#
+# RUN: echo "SECTIONS { /DISCARD/ : { *(.text.foo*) } }" > %t0.script
+# RUN: ld.lld -o %t0 --script %t0.script %t.o
+# RUN: llvm-readobj -S %t0 | FileCheck %s
+#
+# RUN: echo "SECTIONS { /DISCARD/ : { *(.text.foo*) } /DISCARD/ : { *(.note*) } }" > %t1.script
+# RUN: ld.lld -o %t1 --script %t1.script %t.o
+# RUN: llvm-readobj -S %t1 | FileCheck %s
+#
+# RUN: echo "SECTIONS { /DISCARD/ : { *(.note*) } }" > %t2.script
+# RUN: ld.lld -o %t2 --script %t2.script %t.o
+# RUN: llvm-readobj -S %t2 | FileCheck %s
+
+# CHECK-NOT: .text
+# CHECK-NOT: .note
+
+--- !ELF
+FileHeader:
+  Class:           ELFCLASS64
+  Data:            ELFDATA2LSB
+  Type:            ET_REL
+  Machine:         EM_X86_64
+Sections:
+  - Name:            .group
+    Type:            SHT_GROUP
+    Link:            .symtab
+    Info:            foo
+    Members:
+      - SectionOrType:    GRP_COMDAT
+      - SectionOrType:    .text.foo
+      - SectionOrType:    .note.text.foo
+  - Name:            .note.text.foo
+    Type:            SHT_NOTE
+    Flags:           [ SHF_GROUP ]
+    Content:         "DEAD"
+  - Name:            .text.foo
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_EXECINSTR, SHF_GROUP ]
+    Content:         "00"
+Symbols:
+  - Name:            foo
+    Binding:         STB_GLOBAL
+    Type:            STT_FUNC
+    Section:         .text.foo
+    Size:            0x08

So at first glance it looks possible to apply a patch on top of this one to support /DISCARD/, and I'd be happier if we can do it that way rather than discard this review.

That being said, if you're alternative fix improves the case, let's put ego aside and go that way :-)

See D70146:)

The pasted patch changes the test linkerscript/discard-group.s from an infinite recursion to another unexpected behavior (dropping .text for /DISCARD/ : { *(.note*) }). I think we need a new field to express the relationship. In GNU ld, bfd_elf_section_data::next_in_group tracks such information and is used by its garbage collector.

Thanks for the alternative patch! Indeed using a new field avoids interacting with dependentSections and makes it possible to cleanly match ld's behavior.

I'm surprised by ld's behavior wrt. discarding a group member. Based on the semantic of a group, I would have assumed that it either raises an error, or discards the whole group, or raise a warning and keep the section if the whole group is not discarded. Allowing to remove a section from the group seems strange to me. Quoting https://docs.oracle.com/cd/E23824_01/html/819-0690/chapter7-26.html

However, these references make no sense if one of the sections were removed

I'll try to find more info on that aspect.