Page MenuHomePhabricator

[lld][ELF] Ignore __start/__stop symbols in section groups during GC
Needs ReviewPublic

Authored by phosek on Mar 23 2020, 9:33 PM.

Details

Reviewers
ruiu
espindola
Summary

The GC algorithm implementation in lld marks all sections for which
there are start_<section> and stop_<section> references as live.
This leads to an issue where the section with the same name could be
in multiple groups, and while only some of these groups are referenced,
the current implementation will keep all of them.

Concrete example of this issue is LLVM profile instrumentation where for
every function symbol we create a section (assuming -ffunction-sections)
and corresponding llvm_prf_data section that contains associated
metadata, both of which are in a group. When there aren't any references
to the function section, we'd expect the entire group to be collected,
but that won't be the case because of
start___llvm_prf_data and
stop_llvm_prf_data symbol references in profile runtime.

This patch changes the behavior to ignore start/stop symbols in section
groups which allows for these groups to be collected. This also matches
the behavior of BFD linker.

Diff Detail

Event Timeline

phosek created this revision.Mar 23 2020, 9:33 PM
ruiu added a comment.Mar 23 2020, 11:33 PM

Although I understand your concrete example, I'm not sure if this is logically correct. This patch is written based on the following assumption:

  1. Data sections (e.g. _llvm_prf_data) in a section group contain auxiliary data for a function in the same section
  2. If the function gets linked, we need the auxiliary data. Otherwise, we don't.

It looks like the above assumption is too specific to one use case and might have some unexpected side effects.

That being said, section garbage collection isn't defined by the spec and we can do whatever we think reasonable. If this is what GNU linkers are already doing, doing the same thing shouldn't cause too much trouble. Could you point out the code location where GNU linker does this process?

This change changes the rL297049 logic to not apply on sections within a group. My currently feeling is that rL297049 is too conservative. We should check what will break if we drop rL297049, and only special case sections which absolutely need such retaining behavior.

See D76410 for the most recent discussion on __start_$sectionname/__stop_$sectionname symbols.

The discussion and the tests should distinguish the COMDAT semantics from the --gc-sections semantics. They are related but distinct.

COMDAT is a standard ELF feature. It's clear how it's supposed to behave on its own: only one definition survives. Anything that prevents two input COMDAT groups with the same signature symbol from being reduced to exactly one of them in the output is clearly a bug.

The __start_*/__stop_* behavior is not formally specified. I was involved with its original creation, so I can speak with some authority on the spirit of its intent. It was never intended to influence what section contents go into the link, merely to provide symbols for the section bounds after all other logic has done so. IIRC this feature predated --gc-sections. It also predated the common use of COMDAT for C++. So its interactions with them were never really considered at the time. But I think the natural expectation of what section group semantics mean with these features is clear enough: a group always travels together. So when --gc-sections decides that .text.f is unreferenced then there is nothing keeping any of the f group in the link, so it all goes. The __start_ and __stop_ references don't affect this--they only resolve to whatever output section is left with that name after all the other logic such as COMDAT rules and --gc-sections has been applied.

The discussion and the tests should distinguish the COMDAT semantics from the --gc-sections semantics. They are related but distinct.

COMDAT is a standard ELF feature. It's clear how it's supposed to behave on its own: only one definition survives. Anything that prevents two input COMDAT groups with the same signature symbol from being reduced to exactly one of them in the output is clearly a bug.

The __start_*/__stop_* behavior is not formally specified. I was involved with its original creation, so I can speak with some authority on the spirit of its intent. It was never intended to influence what section contents go into the link, merely to provide symbols for the section bounds after all other logic has done so. IIRC this feature predated --gc-sections. It also predated the common use of COMDAT for C++. So its interactions with them were never really considered at the time. But I think the natural expectation of what section group semantics mean with these features is clear enough: a group always travels together. So when --gc-sections decides that .text.f is unreferenced then there is nothing keeping any of the f group in the link, so it all goes. The __start_ and __stop_ references don't affect this--they only resolve to whatever output section is left with that name after all the other logic such as COMDAT rules and --gc-sections has been applied.

Thanks for the context, I agree this makes sense, but it differs from the logic currently implemented by both lld and based on my experiments also BFD ld and gold, so while I agree that it's the semantics that makes sense, it's a question whether we want to diverge from the current behavior which might potentially break existing code.

ruiu added a comment.Mar 24 2020, 10:49 PM

lld's --gc-sections behavior didn't take start/stop symbols into consideration, but it turned out some programs didn't work with that algorithm, so we (I believe rafael or pcc) added the logic to preserve sections for start/stop symbols. Therefore, I think we can't ignore start/stop symbols at the section gc stage and need to at least keep the existing logic.

MaskRay added a subscriber: Bdragon28.EditedMar 24 2020, 11:07 PM

lld's --gc-sections behavior didn't take start/stop symbols into consideration, but it turned out some programs didn't work with that algorithm, so we (I believe rafael or pcc) added the logic to preserve sections for start/stop symbols. Therefore, I think we can't ignore start/stop symbols at the section gc stage and need to at least keep the existing logic.

The description of rL297049 mentioned FreeBSD kernel. It is probably about linker sets. @Bdragon28 may provide more context.

The discussion and the tests should distinguish the COMDAT semantics from the --gc-sections semantics. They are related but distinct.

COMDAT is a standard ELF feature. It's clear how it's supposed to behave on its own: only one definition survives. Anything that prevents two input COMDAT groups with the same signature symbol from being reduced to exactly one of them in the output is clearly a bug.

The __start_*/__stop_* behavior is not formally specified. I was involved with its original creation, so I can speak with some authority on the spirit of its intent. It was never intended to influence what section contents go into the link, merely to provide symbols for the section bounds after all other logic has done so. IIRC this feature predated --gc-sections. It also predated the common use of COMDAT for C++. So its interactions with them were never really considered at the time. But I think the natural expectation of what section group semantics mean with these features is clear enough: a group always travels together. So when --gc-sections decides that .text.f is unreferenced then there is nothing keeping any of the f group in the link, so it all goes. The __start_ and __stop_ references don't affect this--they only resolve to whatever output section is left with that name after all the other logic such as COMDAT rules and --gc-sections has been applied.

Thanks for the context, I agree this makes sense, but it differs from the logic currently implemented by both lld and based on my experiments also BFD ld and gold, so while I agree that it's the semantics that makes sense, it's a question whether we want to diverge from the current behavior which might potentially break existing code.

If we have reasonable GC behavior. We can ask GNU ld and gold to adopt. I managed to do this once https://sourceware.org/ml/binutils/2020-02/msg00035.html