This is an archive of the discontinued LLVM Phabricator instance.

[ELF] - Do not ICF two sections with different output sections when using linker scripts
Abandoned, Public

Authored by grimar on Nov 12 2018, 5:17 AM.

Details

Summary

This is https://bugs.llvm.org//show_bug.cgi?id=39418.

Currently, when LLD performs ICF, it checks whether the output section names are the same,
but that works only when no linker script is used.
We create output sections and assign input sections to them much later.
The patch adds logic to predict the output sections earlier, so that
we can perform ICF more correctly without complicated changes to the linker design.
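
The idea in the summary can be illustrated roughly as follows. This is only a sketch under stated assumptions, not the code from the patch: `Section`, `predictedOut`, and `canFold` are hypothetical names invented for illustration, and real ICF compares far more than raw contents (flags, relocations, and so on).

```cpp
#include <string>

// Hypothetical, simplified stand-in for an input section (not LLD's class).
struct Section {
  std::string name;         // input section name, e.g. ".text.foo"
  std::string contents;     // raw section bytes
  std::string predictedOut; // output section predicted from the linker script
};

// Two sections are folding candidates only if their contents are identical
// AND they are predicted to land in the same output section.
bool canFold(const Section &a, const Section &b) {
  return a.contents == b.contents && a.predictedOut == b.predictedOut;
}
```

With such a check, two byte-identical functions that a script places in different output sections would both be kept.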

I used the test case provided on the PR page. Thanks, Andrew Ng :)

Event Timeline

grimar created this revision. Nov 12 2018, 5:17 AM
grimar edited the summary of this revision. Nov 12 2018, 5:19 AM
grimar edited the summary of this revision.
ruiu added a comment. Nov 12 2018, 10:39 PM

I don't think this is necessarily a bug. At least, "predicting" the name of an output section does not seem like a good idea to me. It is getting too tricky, and I don't want to add more complexity here. I generally do not encourage users to use linker scripts, as they make linking slower and trickier, and to me it is an acceptable consequence that ICF folds input sections before linker scripts bin input sections into output sections.

I agree that prediction seems like the wrong approach, unless we can guarantee 100% accuracy (prediction implies that it isn't always accurate). However, @ruiu, whilst you may not encourage users to use linker scripts, they are widely used, and in some instances, potentially even many, there is no reasonable alternative. ICF folding input sections between output sections can result in invalid output, especially if those output sections are supposed to be in different program segments, which could result in runtime crashes. I therefore would consider any such incorrect assignment a bug in LLD.

I agree with James here. I strongly suspect that in systems where merging content is a problem, such as embedded systems with overlays or with only a subset of memory available for booting, there may be little correlation between the input section names chosen by the compiler and the names given to the output sections.

The only thing I can think of right now that doesn't involve an early assignment of input sections to output sections is to exploit the Repl field. When assigning InputSections to OutputSections, try to match the non-live sections against the InputSectionDescriptions as well. If a folded section matches a different OutputSection from that of the InputSection it was folded into, then mark it live and assign it to the OutputSection it matched.
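
The suggestion above can be sketched roughly as follows. This is a simplified illustration, not LLD code: the `InputSection` and `OutputSection` structs, the `Matcher` callback (standing in for matching against the script's input section descriptions), and `assignSections` are all hypothetical. The sketch also ignores details a real implementation would need to handle, such as symbols and relocations that already refer to the folded copy.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

struct OutputSection {
  std::string name;
};

// Hypothetical, simplified input section.
struct InputSection {
  explicit InputSection(std::string n) : name(std::move(n)) {}
  std::string name;
  bool live = true;
  InputSection *repl = this;       // section this one was folded into (self if not folded)
  OutputSection *parent = nullptr; // assigned output section, if any
};

// Assumed callback: returns the output section the script would place `sec` in.
using Matcher = std::function<OutputSection *(const InputSection &)>;

void assignSections(std::vector<InputSection *> &sections, const Matcher &match) {
  // First pass: place every live section.
  for (InputSection *s : sections)
    if (s->live)
      s->parent = match(*s);

  // Second pass: revisit sections that ICF folded away.
  for (InputSection *s : sections) {
    if (s->live || s->repl == s)
      continue; // already placed, or dead for a reason other than folding
    OutputSection *out = match(*s);
    if (out != s->repl->parent) {
      // The folded copy belongs elsewhere: undo the fold and keep it.
      s->repl = s;
      s->live = true;
      s->parent = out;
    }
  }
}
```

A section folded into a representative that ends up in the same output section stays folded; only cross-output-section folds are undone.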

As an aside, the approach outlined in https://llvm.org/devmtg/2017-10/slides/LTOLinkerScriptsEdlerVonKoch.pdf seems to favour an early assignment of InputSections to OutputSections. However, I've not seen much movement on getting that upstream since the RFC at http://lists.llvm.org/pipermail/llvm-dev/2018-May/123252.html.

grimar abandoned this revision. Aug 26 2019, 1:36 AM

Abandoning based on the comments (another patch was posted instead).