This is an archive of the discontinued LLVM Phabricator instance.

[ELF2] SECTIONS command basic support
ClosedPublic

Authored by denis-protivensky on Oct 28 2015, 1:31 AM.

Details

Summary

Add support of SECTIONS command:

  • determine output section by input section name
  • discard input sections
  • order output sections accordingly
  • don't add bounding symbols (__start/__stop) for sections defined in the linker script

Diff Detail

Repository
rL LLVM

Event Timeline

denis-protivensky retitled this revision from to [ELF2] SECTIONS command basic support.
denis-protivensky updated this object.
denis-protivensky added reviewers: ruiu, rafael.
denis-protivensky added a project: lld.
denis-protivensky updated this object.
denis-protivensky added a subscriber: llvm-commits.
grimar added a subscriber: grimar.Oct 28 2015, 2:12 AM
grimar added inline comments.
ELF/Writer.cpp
453 ↗(On Diff #38634)

This could be in class declaration.

495 ↗(On Diff #38634)

Does not seem that llvm_unreachable is really needed. Code just few lines above creates OutputSections or MergeOutputSections, no any other choices.

885 ↗(On Diff #38634)

This would work a bit faster in some cases:

if (Config->OutputSections.find(A->getName()) == ItEnd ||
Config->OutputSections.find(B->getName()) == ItEnd )
{...

ELF/Writer.cpp
453 ↗(On Diff #38634)

Ok.

495 ↗(On Diff #38634)

There are other choices coming from the cached values of Map. It now holds only Bss before the loop, but who knows what can be added later. That's a good form of protection.

885 ↗(On Diff #38634)

I don't think we should sacrifice readability since performance benefit is not clear here.
I need iterators after this check, and I don't want to write C-style code like:

if ((ItA = Config->.... ) == ItEnd ||
  ((ItB = Config->.... ) == ItEnd))

I will moreover need to define full iterator types before usage instead of auto deduction.

grimar added inline comments.Oct 28 2015, 3:41 AM
ELF/Writer.cpp
885 ↗(On Diff #38634)

I didnt notice at first that you use ItA/ItB after that. Ok then.

denis-protivensky marked 3 inline comments as done.Oct 28 2015, 8:10 AM
denis-protivensky added inline comments.
ELF/Writer.cpp
885 ↗(On Diff #38634)

I actually can break it up into two separate checks so it'll be more optimal and won't affect code style.

grimar added inline comments.Oct 28 2015, 8:18 AM
ELF/Writer.cpp
885 ↗(On Diff #38634)

I thought about that but for me it will look not very readable. Not sure if that possible little speed up worth that.

ruiu added inline comments.Oct 28 2015, 8:50 AM
ELF/OutputSections.h
98–99 ↗(On Diff #38634)

Why do you need this?

ELF/Writer.cpp
92–107 ↗(On Diff #38634)

No, please don't make another Writer class. This is over-engineering. There's no such need to support two writers, one for ELF without linker script and another for ELF + linker script.

denis-protivensky marked 3 inline comments as done.

Updated:

  • Remove dynamic casting of output section classes
  • Remove separate class for linker script support
  • Move linker script-related code to Writer and use conditional checks where needed
grimar added inline comments.Oct 29 2015, 3:49 AM
ELF/Config.h
38 ↗(On Diff #38716)

May be regroup would look better ?

llvm::StringRef InputFile;
std::vector<llvm::StringRef> Names;
std::vector<llvm::StringRef> ExcludeFiles;
ELF/Writer.cpp
460 ↗(On Diff #38716)

What about next one ?

 if (!IS || !IS->isLive() || IS == &InputSection<ELFT>::Discarded)
    return true;

return hasCustomSections() && InputToOutputSection.lookup(IS->getSectionName()) == "/DISCARD/";
910 ↗(On Diff #38716)

I think you want to use & after autos:
auto &OutSec, auto &InSec ...

denis-protivensky marked 3 inline comments as done.Oct 29 2015, 4:26 AM
denis-protivensky added inline comments.
ELF/Config.h
38 ↗(On Diff #38716)

Ok, maybe.

ELF/Writer.cpp
460 ↗(On Diff #38716)

A matter of taste again. Okay.

910 ↗(On Diff #38716)

Yes, that's right.

denis-protivensky marked 3 inline comments as done.

Update:

  • Style fixes
  • Use reference in range-for-loop (auto &)
ruiu added inline comments.Oct 29 2015, 6:48 AM
ELF/LinkerScript.cpp
250 ↗(On Diff #38723)

I wouldn't write this in this functional style. This can be written in a plain C style if you define skip(S) which returns true if the next token is S.

expect("(");
while (!skip(")") {
  StringRef Tok = next();
  if (Tok == "EXCLUDE_FILE") {
  ...
ELF/Writer.cpp
461–465 ↗(On Diff #38723)

Why don't you create start/stop symbols if the section is defined using a linker script? Is that a documented behavior?

614 ↗(On Diff #38723)

I don't like to add more code to createSections() which is already too long. Move this lambda out of this function.

grimar added inline comments.Oct 29 2015, 7:22 AM
ELF/Writer.cpp
909 ↗(On Diff #38723)

I think you can do the same for auto Name.

ELF/LinkerScript.cpp
250 ↗(On Diff #38723)

mapBraces actually does the same. I can change peek to skip, but why to produce code duplication where it can be avoided? Procedural style may be more cumbersome here than the current functional implementation.

ELF/Writer.cpp
461–465 ↗(On Diff #38723)

If one defines output section description, they may define all needed symbols manually.
That's why I think any default actions that can be achieved with linker script should be suppressed.

614 ↗(On Diff #38723)

I need RegularSections, which is local to the method. It's captured in the lambda, and I think it would be ugly to pass it to the comparison function along with two output sections to compare. Any ideas how to make it better?

909 ↗(On Diff #38723)

I'll use real types instead.

ruiu added inline comments.Oct 29 2015, 7:58 AM
ELF/LinkerScript.cpp
250 ↗(On Diff #38723)

By duplicate, do you mean this part? I don't think this is duplicate.

expect("(");
while (!skip(")") {
ELF/Writer.cpp
614 ↗(On Diff #38723)

Why do you need RegularSections in the first place? What is this !RegularSections.(A)||!RegularSections.count(B) condition for?

ELF/Writer.cpp
614 ↗(On Diff #38723)

The intent is to sort 'specific' sections like .symtab, .strtab and others with default sorting order even if some of these names are defined in linker script. Can't figure out why the tests pass without this check as I thought I specifically checked that case. Will look at this more.

ruiu edited edge metadata.Oct 29 2015, 12:20 PM
ruiu added a subscriber: ruiu.

2015/10/29 8:31 "Denis Protivensky" <dprotivensky@accesssoftek.com>:

denis-protivensky added inline comments.

Comment at: ELF/Writer.cpp:614
@@ -574,3 +613,3 @@

  • std::stable_sort(OutputSections.begin(), OutputSections.end(),
  • compareOutputSections<ELFT>);

+ std::stable_sort(

+ OutputSections.begin(), OutputSections.end(),

ruiu wrote:

denis-protivensky wrote:

ruiu wrote:

I don't like to add more code to createSections() which is already

too long. Move this lambda out of this function.

I need RegularSections, which is local to the method. It's captured

in the lambda, and I think it would be ugly to pass it to the comparison
function along with two output sections to compare. Any ideas how to make
it better?

Why do you need RegularSections in the first place? What is this

!RegularSections.(A)||!RegularSections.count(B) condition for?

The intent is to sort 'specific' sections like .symtab, .strtab and

others with default sorting order even if some of these names are defined
in linker script. Can't figure out why the tests pass without this check as
I thought I specifically checked that case. Will look at this more.

I don't think we need that protection. If users try to reorder such
sections, we should do that as instructed, no?

http://reviews.llvm.org/D14140

I don't think we need that protection. If users try to reorder such
sections, we should do that as instructed, no?

First of all, I reproduced the case when defining 'specific' section in the SECTIONS command affects its ordering in the resulting file without the check, so this check works as I expected.

Concerning ordering of 'specific' sections:
The goal of the SECTIONS command is to layout sections from input files to the corresponding output sections. 'Specific' sections as .got, .plt, .interp and most others don't have input sections at all and are generated by the linker logic. This means that we cannot affect the contents of such sections, and the output section description command of format

.interp : { *(.interp) }

is void because there are no input .interp sections, and such empty output sections are discarded by linker rules. The only way this format of command worked is to define custom .interp section (possibly of SHT_PROGBITS/_NOBITS type). Then it will be considered during layout, but that's a specific case and it's checked in tests (with .shstrtab custom section).
I conclude that output section descriptions affect only PROGBITS/NOBITS section types, which are represented as OutputSection/MergeOutputSection in lld, and we currently create only these sections from output section descriptions.
That's why I think the ordering should also affect only these sections, which are stored in RegularSections map in code.

Just as a note: ld doesn't seem as well to reorder 'specific' sections even if defined in linker script.

ELF/LinkerScript.cpp
250 ↗(On Diff #38723)

Do you think this code:

expect("(");
while (!skip(")")) {
  StringRef Tok = next();
  if (Tok == "EXCLUDE_FILE") {
    expect("(");
    while (!skip(")"))
      InSec.ExcludeFiles.push_back(next());
  } else
    InSec.Names.push_back(Tok);
}

is simpler and more clear than the original one:

mapBraces("(", ")", [this, &InSec]() {
  StringRef Tok = next();
  if (Tok == "EXCLUDE_FILE")
    mapBraces("(", ")",
              [this, &InSec]() { InSec.ExcludeFiles.push_back(next()); });
  else
    InSec.Names.push_back(Tok);
});

?

ruiu added a comment.Oct 31 2015, 8:48 AM

I think this falls in the category of undefined behaviors if there was a
spec of the linker script. Since there's no standard or specification for
the linker script, we need to do something that we think reasonable. A
pitfall we want to avoid is to copy every detail of the GNU linker. So can
you please minimize the impact of the patch to the regular flow? I'd
restore code for start/stop symbols and for RegularSections, and
reorder "non-regular" sections if instructed to do so.

ruiu added inline comments.Oct 31 2015, 9:02 AM
ELF/LinkerScript.cpp
250 ↗(On Diff #38723)

Yes, I do think so (as a C/C++ programmer and as a Scheme programmer). The former is much more straightforward C++ and easier to understand than the latter, especially for people who are not very familiar with all the code in this file.

denis-protivensky edited edge metadata.

Updated:

  • Replace mapBraces method with procedural checks in the linker script parser
  • Change peek() -> skip() in the parser
  • Remove custom bounding symbols processing for linker script
  • Remove checks for regular section ordering for linker script
  • Change section comparison lambda with method
  • Remove not used fields of input section description struct
ruiu added inline comments.Nov 2 2015, 12:41 PM
ELF/Config.h
33–35 ↗(On Diff #38891)

A struct with one field doesn't have much value. Please remove.

ELF/LinkerScript.cpp
50–54 ↗(On Diff #38891)

Move readInputSectionDescription and readOutputSectionDescription at the end of this list (after one blank line) so that helper functions are not mixed with readers for top-level constructs.

152–153 ↗(On Diff #38891)

Remove this blank line and a comment. (If you really want to add a comment to this short function, write that as a function comment, but I think we don't need that.)

245–247 ↗(On Diff #38891)

You can remove {}.

ELF/Writer.cpp
435 ↗(On Diff #38891)

Do you really need to check for that? If it's empty, find() will not find anything, so this check seems extraneous.

453 ↗(On Diff #38891)

Rename discardInputSection -> isDiscarded

456 ↗(On Diff #38891)

Is this check needed?

463 ↗(On Diff #38891)

Is this needed?

622–626 ↗(On Diff #38891)

This may be able to be written as

std::stable_sort(OutputSections.begin(), OutputSections.end(), compareSections);
denis-protivensky marked 13 inline comments as done.

Updated:

  • Style fixes in linker script
  • Remove InputSectionDescription
  • Remove hasCustomSections check
  • Rename discardInputSection -> isDiscarded
  • Make compareSections static
ruiu added inline comments.Nov 3 2015, 11:13 AM
ELF/Writer.cpp
550–557 ↗(On Diff #39034)
for (const std::unique_ptr<ObjectFile<ELFT>> &F : Symtab.getObjectFiles())
  for (InputSectionBase<ELFT> *C : F->getSections())
    if (!isDiscarded(C))
      if (auto *S = dyn_cast<InputSection<ELFT>>(C))
        scanRelocs(*S);
906–911 ↗(On Diff #39034)

Looks like OutputSectionDescription doesn't have to be a struct. We can make Config->OutputSections of type StringMap<StringRef> which maps input section names to output section names.

rafael edited edge metadata.Nov 5 2015, 8:50 AM
rafael added a subscriber: rafael.

First of all, I reproduced the case when defining 'specific' section in the SECTIONS command affects its ordering in the resulting file without the check, so this check works as I expected.

Concerning ordering of 'specific' sections:
The goal of the SECTIONS command is to layout sections from input files to the corresponding output sections. 'Specific' sections as .got, .plt, .interp and most others don't have input sections at all and are generated by the linker logic. This means that we cannot affect the contents of such sections, and the output section description command of format

.interp : { *(.interp) }

is void because there are no input .interp sections, and such empty output sections are discarded by linker rules. The only way this format of command worked is to define custom .interp section (possibly of SHT_PROGBITS/_NOBITS type). Then it will be considered during layout, but that's a specific case and it's checked in tests (with .shstrtab custom section).
I conclude that output section descriptions affect only PROGBITS/NOBITS section types, which are represented as OutputSection/MergeOutputSection in lld, and we currently create only these sections from output section descriptions.
That's why I think the ordering should also affect only these sections, which are stored in RegularSections map in code.

Just as a note: ld doesn't seem as well to reorder 'specific' sections even if defined in linker script.

It is not clear if this is needed by real program. For now just leave
it out to have a simpler code. Even if it is needed, it would be
better to just organize the code as

  • add sections from input files
  • sort
  • add linker generated sections
  • sort the end + merge

no?
Cheers,
Rafael

It is not clear if this is needed by real program. For now just leave
it out to have a simpler code. Even if it is needed, it would be
better to just organize the code as

  • add sections from input files
  • sort
  • add linker generated sections
  • sort the end + merge

no?

I'll leave simpler code, thanks.

ELF/Writer.cpp
550–557 ↗(On Diff #39034)

Ok.

906–911 ↗(On Diff #39034)

I can't substitute MapVector with StringMap since I need to know original ordering of output sections defined in linker script so I can sort them accordingly.
I'll remove the OutputSectionDescription structure and use MapVector<StringRef, std::vector<StringRef>> instead.

denis-protivensky edited edge metadata.
denis-protivensky marked 2 inline comments as done.

Updated:

  • Remove OutputSectionDescription structure
  • Simplify checks in scan relocs loop
ruiu added a comment.Nov 9 2015, 11:31 AM

One question is the performance with this patch. Does this slow down the linker? If so, how much?

ELF/Writer.cpp
935 ↗(On Diff #39689)

Can you avoid using a word "parse" here? Linker scripts are and should be parsed by LinkerScript.cpp, so every time I see this name, it feels like a domain error. I don't have a good suggestion, but I'd probably name createSectionMap or something.

denis-protivensky marked an inline comment as done.

Updated:

  • Rename parseSectionDescriptions -> buildSectionMap
In D14140#285339, @ruiu wrote:

One question is the performance with this patch. Does this slow down the linker? If so, how much?

The average of 20 iterations of self-linking lld on my machine:

  • without the patch = 420 ms
  • with the patch = 425 ms
ruiu accepted this revision.Nov 11 2015, 1:59 PM
ruiu edited edge metadata.

LGTM with a nit.

ELF/Writer.cpp
40 ↗(On Diff #39902)

Please move this to the beginning of Writer<ELFT>::run().

This revision is now accepted and ready to land.Nov 11 2015, 1:59 PM
emaste added a subscriber: emaste.Nov 11 2015, 9:34 PM
This revision was automatically updated to reflect the committed changes.