This is an archive of the discontinued LLVM Phabricator instance.

[ELF] - Linkerscript: Implemented SORT command
ClosedPublic

Authored by grimar on Jul 25 2016, 4:07 AM.

Details

Summary

When the SORT keyword is used, the linker will sort the files or sections into ascending order by name before placing them in the output file.
It is used in FreeBSD script:
https://svnweb.freebsd.org/base/head/sys/conf/ldscript.amd64?revision=284870&view=markup#l139

This is PR28689.

Diff Detail

Repository
rL LLVM

Event Timeline

grimar updated this revision to Diff 65327.Jul 25 2016, 4:07 AM
grimar retitled this revision from to [ELF] - Linkerscript: Implemented SORT command.
grimar updated this object.
grimar added reviewers: ruiu, rafael.
grimar added subscribers: llvm-commits, grimar, evgeny777.
evgeny777 added inline comments.Jul 25 2016, 4:20 AM
ELF/LinkerScript.cpp
137 ↗(On Diff #65327)

What about SORT(*(.data))? How will .data sections be ordered?

664 ↗(On Diff #65327)

I would check for "SORT" in readOutputSectionDescription.

grimar added inline comments.Jul 25 2016, 4:49 AM
ELF/LinkerScript.cpp
137 ↗(On Diff #65327)

In the same order as linker see them ?

evgeny777 added inline comments.Jul 25 2016, 4:50 AM
ELF/LinkerScript.cpp
137 ↗(On Diff #65327)

The should be sorted by file names, I think.

grimar added inline comments.Jul 25 2016, 4:55 AM
ELF/LinkerScript.cpp
137 ↗(On Diff #65327)

SORT is alias for SORT_BY_NAME.
I do not see anything about sorting by filename in that case.
(https://sourceware.org/binutils/docs/ld/Input-Section-Wildcards.html#Input-Section-Wildcards)

evgeny777 added inline comments.Jul 25 2016, 5:00 AM
ELF/LinkerScript.cpp
137 ↗(On Diff #65327)

Here is the quote from URL you posted

When the SORT_BY_NAME keyword is used, the linker will sort the files or sections into ascending order by name before placing them in the output file.

grimar updated this revision to Diff 65336.Jul 25 2016, 6:20 AM
  • Addressed review comments.
ELF/LinkerScript.cpp
137 ↗(On Diff #65336)

Agree.

664 ↗(On Diff #65327)

I prefer my way.
KEEP is just a part of IS description. Input section wildcard can be surrounded by KEEP.
(https://sourceware.org/binutils/docs/ld/Input-Section-Keep.html#Input-Section-Keep).

SORT as well as other attributes like EXCLUDE_FILE is relative to wildcards itself. It is not at the same level as KEEP in my understanding.

evgeny777 added inline comments.Jul 25 2016, 6:37 AM
ELF/LinkerScript.cpp
139 ↗(On Diff #65336)

What exact sorting order should be used? Shouldn't file name have greater priority than section name?

Also how about this one:

*(SORT(*.data.*))

vs this one:

SORT(*)(*.data.*)
grimar added inline comments.Jul 25 2016, 6:57 AM
ELF/LinkerScript.cpp
139 ↗(On Diff #65336)

I think my first iteration was correct. SORT can be applied both to sections and to filenames separatelly. Since I am applying it for sections here, no need to sort filenames.
As you mentioned documentation says "the linker will sort the files or sections into ascending order by name". Its a bit confusing, but says correct thing. files OR sections. I am going to revert this to previous iteration.

grimar updated this revision to Diff 65341.Jul 25 2016, 7:09 AM
  • Reverted to initial revision.
  • Improved testcase.
emaste added inline comments.Jul 25 2016, 7:11 AM
ELF/LinkerScript.cpp
139 ↗(On Diff #65336)

Right, it should sort by file or section names independently I think; the common use case is for ctor and dtor section names (this is the case for FreeBSD) and it's definitely the section name that's important there.

KEEP (*(SORT(.dtors.*)))

For reproducible builds though we do want to make sure the sort is stable / deterministic in the non-sorted field.

emaste added inline comments.Jul 25 2016, 7:12 AM
ELF/LinkerScript.cpp
139 ↗(On Diff #65341)

My comment crossed your update. To be clear, I think the current version of this block is correct.

grimar added inline comments.Jul 25 2016, 7:15 AM
ELF/LinkerScript.cpp
139 ↗(On Diff #65341)

Yep, I understood. Thanks !
Btw I think I never saw sorting by filenames in scripts. I think it is not common and we can postpone the support. What do you think ?

davide added a subscriber: davide.Jul 25 2016, 9:37 AM

Btw I think I never saw sorting by filenames in scripts. I think it is not common and we can postpone the support. What do you think ?

That seems sensible to me. I'm not aware of linker scripts that sort by filename, and I can't really come up with a compelling use case for it.

ruiu added inline comments.Jul 25 2016, 10:43 AM
ELF/LinkerScript.cpp
133 ↗(On Diff #65341)

I don't really want to add more code to this function as it's getting too complicated. Actually it needs to lose weight. Please hold on while I'm trying to split it.

Recent change that introduced LinkerScript<ELFT>::getSectionMap() does not allow
this patch to fit well.
Since I need to know Sort flag state for each InputSectionDescription.
With current code that looks to be impossible.
I am holding this until we have another approach landed.

To clarify what I meant here is that D22683 csuggest some approach, so it would cross my code if I would try to
change something here, so I plan to wait until tomorrow, may be it be landed or some other solution will. If not - I'll suggest the change
requred for this patch to work then.

ruiu edited edge metadata.Jul 26 2016, 11:44 AM

I think you should be able to make this change by changing the return type of getSectionMap from std::vector<std::pair<StringRef, ArrayRef<StringRef>> to std::vector<std::pair<StringRef, BaseCommand *>>. The point is that it is not okay to add more and more code to a single function.

In D22749#496455, @ruiu wrote:

I think you should be able to make this change by changing the return type of getSectionMap from std::vector<std::pair<StringRef, ArrayRef<StringRef>> to std::vector<std::pair<StringRef, BaseCommand *>>. The point is that it is not okay to add more and more code to a single function.

Yep, I thought about that, and that finally makes getSectionMap() to be very close with loop on commands, like it was before. May be getSectionMap() is itself excessive ?

ruiu added a comment.Jul 26 2016, 12:03 PM

I don't know yet, but we need to do something. It's probably nice to slow down a bit and make the code simpler and easier to read.

Anyways, I`ll update this patch tomorrow, using your suggestion for start for example and we can see then how it looks.

I don't know yet, but we need to do something. It's probably nice to slow down a bit and make the code simpler and easier to read.

One little comment, we're now quite close to fully handling the FreeBSD kernel linker script.

What's missing:

  1. AT (D19272)
  2. SORT (D22749)
  3. Filename specification (e.g. KEEP (*crtbegin.o(.ctors)))
  4. EXCLUDE_FILE (D22795)
  5. CONSTRUCTORS
  6. . = assignment in output section (PR28720)

I think CONSTRUCTORS is just a nop for ELF (and we can delete it from the linker script) and PR28720 can also be addressed easily in FreeBSD's linker script instead of changing lld.

At least as far as the FreeBSD base system and kernel is concerned there's not much left to implement, regardless of whether we pause now to refactor and clean up or not.

ELF/LinkerScript.cpp
139 ↗(On Diff #65341)

Yes I think it's sensible to postpone support for sorting by filename.

grimar updated this revision to Diff 65681.Jul 27 2016, 2:00 AM
grimar edited edge metadata.
  • Rebased.
grimar added a comment.EditedJul 27 2016, 2:29 AM
  1. Filename specification (e.g. KEEP (*crtbegin.o(.ctors)))

it is D22852.

#CONSTRUCTORS

Constructors is D22848.

grimar updated this revision to Diff 65691.Jul 27 2016, 3:40 AM
  • Removed redundant methods.
grimar updated this revision to Diff 65692.Jul 27 2016, 3:42 AM
  • Updated comment.
grimar updated this revision to Diff 65706.Jul 27 2016, 4:41 AM
  • Removed unrelated change.
ruiu added inline comments.Jul 27 2016, 3:15 PM
ELF/LinkerScript.cpp
113–123 ↗(On Diff #65706)

This change makes me wonder why we need the Factory in the linker script. We need to create output sections as instructed by linker scripts. But, the factory may create different output sections with the same name. I started to think that this is not correct.

ELF/LinkerScript.h
137 ↗(On Diff #65706)

ArrayRef?

grimar added inline comments.Jul 28 2016, 1:14 AM
ELF/LinkerScript.cpp
113–123 ↗(On Diff #65706)

Do you mean D22516 and following little discussion about Rafael's experiments in maillist ?
We can think about how to fix that, but since gnu linkers work in that way for a long time, it is probably not so urgent now.

grimar updated this revision to Diff 65894.Jul 28 2016, 3:09 AM
  • Addressed review comments + refactored.

By the way, if we create bool Keep field in InputSectionDescription,
that can help to remove the Keep argument from few methods.
I know that Keep is not used after parcing since we have a separate array for keeped sections, but probably that
make code a bit cleaner and can worth that little field.

ELF/LinkerScript.h
137 ↗(On Diff #65706)

Ok. Though it is ArrayRef<std:unique_ptr<...>> then. I am not sure why,
but I feel it is a bit strange combination.

ruiu added inline comments.Jul 28 2016, 12:30 PM
ELF/LinkerScript.cpp
120–123 ↗(On Diff #65894)

This little helper seems to take too much arguments. I'd be nice if you can reduce it (though I have no clear idea how.)

728–730 ↗(On Diff #65894)

Format.

767 ↗(On Diff #65894)

peek().startswith("*")

grimar updated this revision to Diff 66042.Jul 28 2016, 3:36 PM
  • Addressed review comments.
ELF/LinkerScript.cpp
120–123 ↗(On Diff #65894)

I suggest to do sort in separate method then. It helps here + if we will want one day
to support sorting by file name or SORT_BY_ALIGNMENT, we should be able just to extend it/modify.

728–730 ↗(On Diff #65894)

Fixed.

767 ↗(On Diff #65894)

Done (in separate NFC commit).

ruiu added inline comments.Jul 28 2016, 3:43 PM
ELF/LinkerScript.cpp
153 ↗(On Diff #66042)

Candidate for what? Maybe just Sections is better.

162 ↗(On Diff #66042)

I don't think making this a separate function improves readability. This is probably better.

if (InCmd->Sort)
  std::stable_sort(Candidates.begin(), Candidates.end(), compareByName);

where compareByName is a separate file-scope function.

grimar updated this revision to Diff 66045.Jul 28 2016, 3:50 PM
  • Addressed review comments.
ELF/LinkerScript.cpp
153 ↗(On Diff #66042)

For adding :) Changed to Sections.

162 ↗(On Diff #66042)

Done.

grimar updated this revision to Diff 66092.Jul 29 2016, 1:26 AM
  • Just rebase.
rafael edited edge metadata.Jul 29 2016, 7:15 AM

There are at least two independent cleanups. They LGTM. Can you commit them and rebase this?

ELF/LinkerScript.cpp
119 ↗(On Diff #66092)

Can you commit making this a static function as an independent cleanup?

ELF/LinkerScript.h
145 ↗(On Diff #66092)

Can you commit just the change to return more info first?

rafael added inline comments.Jul 29 2016, 7:20 AM
ELF/LinkerScript.cpp
147 ↗(On Diff #66092)

Maybe getSectionMap should do this filtering?

grimar added inline comments.Jul 29 2016, 7:27 AM
ELF/LinkerScript.cpp
119 ↗(On Diff #66092)

This probably intersects with getSectionMap() change, so it probably be a single cleanup patch. Looking.

147 ↗(On Diff #66092)

For this patch this probably be fine, though generally we can need another commands afaik.
Let me quickly prepare separate cleanup patch(s) with this and other changes.

grimar added inline comments.
ELF/LinkerScript.cpp
147 ↗(On Diff #66092)

Ah no, that is probably not fine. That way getSectionMap() should create and return array of InputSectionDescriptions instead of returning array of commands that already exist. Not sure it is better since we anyways will want to access other commands as well.
This cleanup patch is D22962.

grimar updated this revision to Diff 66126.Jul 29 2016, 8:27 AM
grimar edited edge metadata.
  • Rebased.
grimar updated this revision to Diff 66127.Jul 29 2016, 8:30 AM
  • clang-format.
ruiu accepted this revision.Jul 29 2016, 8:38 AM
ruiu edited edge metadata.

LGTM

This revision is now accepted and ready to land.Jul 29 2016, 8:38 AM
This revision was automatically updated to reflect the committed changes.