This is an archive of the discontinued LLVM Phabricator instance.

[lld] [LinkerScript] Implement semantics for simple sections mappings
AbandonedPublic

Authored by rafaelauler on Feb 26 2015, 11:38 AM.

Details

Reviewers
ruiu
shankarke
Summary

This patch implements the behaviour of the SECTIONS linker script directive,
used to define a custom mapping between input and output sections. Not all
SECTIONS constructs are supported yet: only those that do not use wildcard
matching to define section names and that do not sort or use any other
special feature. This patch also adds support for evaluating linker script
expressions when they are used to define a new address for an output
section. I added a LIT test as a practical example of which SECTIONS
directives are currently supported.

The strategy employed here to change the layout of the output file based on
the linker script file was based on a suggestion by Shankar and involves
using "rule ids".

We start by assigning each linker script "rule" (a mapping between input and
output section) a "rule id" in an incremental fashion: the first rules have
the lowest ids. Afterwards, in the ELF reader, when creating ELF defined atoms,
I check which input file and input section each atom came from, try to
match a linker script rule against it, and assign the atom a "rule id". When
done reading, we should have a "rule id" for each atom, which is essentially a
different "ordinal" number, but crafted in terms of what the linker script
thinks the order should be.
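
As a rough illustration of the rule id scheme described above, here is a minimal sketch. It is not the code from the patch: the `Rule` struct, the `findRuleId` function, and the convention that `"*"` matches any input file are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical rule: one (input file, input section) -> output section mapping.
struct Rule {
  std::string inputFile;     // "*" matches any file in this sketch
  std::string inputSection;  // exact name only, no wildcard matching
  std::string outputSection;
};

// Rule ids are positions in script order, so earlier rules get lower ids.
// Returns -1 when no rule matches the atom's origin.
int findRuleId(const std::vector<Rule> &rules, const std::string &file,
               const std::string &section) {
  assert(!section.empty() && "this sketch expects a named input section");
  for (std::size_t i = 0; i < rules.size(); ++i) {
    const Rule &r = rules[i];
    bool fileMatches = r.inputFile == "*" || r.inputFile == file;
    if (fileMatches && r.inputSection == section)
      return static_cast<int>(i);
  }
  return -1;
}
```

A reader would call something like `findRuleId` once per atom and store the result next to the atom's ordinal; the first matching rule wins, which is a choice of this sketch rather than something the summary states.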

In terms of high-level changes, I created a new class "script::Sema" that owns
all linker script ASTs and the logic for linker script semantics as well.
ELFLinkingContext owns a single copy of Sema, which will be used throughout
the object file reading process (to assign rule ids to atoms) and writing
process (to lay out sections as proposed by the linker script).

Another high-level change is that the writer no longer uses a "const" copy of
the linking context. This happens because linker script expressions must be
calculated *while* calculating final virtual addresses, which is a very late
step in object file writing. While calculating these expressions, we need to
update the linker script symbol table (inside the semantics object), and, thus,
we are "modifying our context" as we prepare to write the file.
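
To make the point about mutable context concrete, here is a small sketch of address assignment that both reads and writes a symbol table. Everything in it is hypothetical (`OutputSection`, `assignAddresses`, the `endSymbol` field, the restriction of expressions to `symbol + constant`); the real evaluator in the patch handles full linker script expressions.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical output section whose start address is "addrSymbol + addrOffset".
struct OutputSection {
  std::string name;
  std::uint64_t size;
  bool hasAddr;            // true if the script gives an explicit address
  std::string addrSymbol;  // empty means "no symbol term"
  std::uint64_t addrOffset;
  std::string endSymbol;   // if non-empty, defined as the section's end address
};

// The symbol table is taken by non-const reference: evaluating one section's
// address may define symbols that a later section's expression reads.
std::map<std::string, std::uint64_t>
assignAddresses(const std::vector<OutputSection> &sections,
                std::map<std::string, std::uint64_t> &symbols) {
  std::map<std::string, std::uint64_t> addresses;
  std::uint64_t dot = 0;  // location counter
  for (const OutputSection &sec : sections) {
    if (sec.hasAddr) {
      std::uint64_t base = 0;
      if (!sec.addrSymbol.empty()) {
        auto it = symbols.find(sec.addrSymbol);
        assert(it != symbols.end() && "symbol used before it was defined");
        base = it->second;
      }
      dot = base + sec.addrOffset;
    }
    addresses[sec.name] = dot;
    dot += sec.size;
    if (!sec.endSymbol.empty())
      symbols[sec.endSymbol] = dot;  // mutation during layout
  }
  return addresses;
}
```

Because symbol values only become known while addresses are being computed, a writer holding a const context could not record them, which is the constraint the paragraph above describes.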

Diff Detail

Event Timeline

rafaelauler retitled this revision from to [lld] [LinkerScript] Implement semantics for simple sections mappings.
rafaelauler updated this object.
rafaelauler edited the test plan for this revision.
rafaelauler added reviewers: shankarke, ruiu.
rafaelauler added a subscriber: Unknown Object (MLST).
shankarke edited edge metadata. Feb 26 2015, 11:54 AM

Cool!

A few comments:

a) We should assign rule ids to sections, not to DefinedAtoms.

  • The ruleid could be assigned when parsing linker scripts, and you attach a rule id to it.
  • DefinedAtoms should not have this extra member ruleid.
  • So when the AtomSection is queried for the output section from the LinkerScript data structure, it would return a tuple that contains (order_of_outputsection, ruleid).

b) The linker script matching should not be done in the DefaultLayout class. ScriptLayout was essentially designed to take advantage of linker scripts.

emaste added a subscriber: emaste. Feb 26 2015, 12:04 PM
ruiu edited edge metadata. Feb 26 2015, 12:39 PM

I don't understand why you needed to introduce the notion of "rule id" to sort sections in a specific order. What you are trying to do with this patch is to lay out sections in an order specified by linker scripts, right? I understand that we need that much code for linker script evaluation because it's a small programming language, but I'm not convinced that attaching a "rule id" to each DefinedAtom helps implement section ordering.

It seems that we can just have a simple renaming map, which maps from-sections (e.g. foo.o:.foo) to to-sections (.text), and look up that map in the writer, which then makes the final decision about which atoms are put in which section. At least, it seems that should be enough to make the test that you wrote pass.
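
A sketch of the renaming map suggested above might look like the following. The names `InputKey` and `outputSectionFor` are made up for illustration, and falling back to the input section's own name when there is no entry is an assumption of this sketch.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical key: (input file, input section), e.g. ("foo.o", ".foo").
typedef std::pair<std::string, std::string> InputKey;

// Looks up the to-section for a from-section. Sections without an entry
// keep their own name.
std::string outputSectionFor(const std::map<InputKey, std::string> &renames,
                             const std::string &file,
                             const std::string &section) {
  assert(!section.empty() && "this sketch expects a named input section");
  std::map<InputKey, std::string>::const_iterator it =
      renames.find(InputKey(file, section));
  return it == renames.end() ? section : it->second;
}
```

Note that such a map answers only "which output section?" for an atom; it carries no information about the order of atoms within that output section.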

Rui,

There are more complicated examples than that where rule ids are required. IMO, it makes the design scale much better than keeping more data structures.

For example :

.data : { *(.data1 *.data2 *.data3) } is very hard to implement with one single map.

Say you have two objects, 1.o and 2.o, each of which contains data1, data2 and data3 as sections.

1.o { data1->a, data2->b, data3->c } and 2.o { data1->d, data2->e, data3->f }

The (->) corresponds to the section containing the defined atom.

The Layout would need to be a,b,c,d,e,f instead of a,d,b,e,c,f.

Rule id is less costly than using a map IMO.
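
The ordering argument can be illustrated with a small sketch. It assumes that atom f lives in data3 of 2.o and that all six atoms match the same rule and so share one rule id; the `Atom` struct and both layout functions are hypothetical and not taken from lld.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct Atom {
  std::string name;     // a..f in the example
  std::string section;  // input section name
  int ruleId;           // id of the linker script rule that matched
};

static std::string joinNames(const std::vector<Atom> &atoms) {
  std::string order;
  for (const Atom &a : atoms)
    order += a.name;
  return order;
}

// Stable sort by rule id: atoms matched by the same rule keep input order.
std::string layoutByRuleId(std::vector<Atom> atoms) {
  std::stable_sort(atoms.begin(), atoms.end(),
                   [](const Atom &x, const Atom &y) { return x.ruleId < y.ruleId; });
  return joinNames(atoms);
}

// Grouping by input section name instead interleaves the two object files.
std::string layoutBySectionName(std::vector<Atom> atoms) {
  std::stable_sort(atoms.begin(), atoms.end(),
                   [](const Atom &x, const Atom &y) { return x.section < y.section; });
  return joinNames(atoms);
}

void demo() {
  // Input order: all of 1.o, then all of 2.o; one rule, so one rule id.
  std::vector<Atom> atoms = {{"a", ".data1", 0}, {"b", ".data2", 0},
                             {"c", ".data3", 0}, {"d", ".data1", 0},
                             {"e", ".data2", 0}, {"f", ".data3", 0}};
  assert(layoutByRuleId(atoms) == "abcdef");
  assert(layoutBySectionName(atoms) == "adbecf");
}
```

With a single shared rule id the stable sort leaves the input order untouched, giving a,b,c,d,e,f, whereas keying on the section name alone produces a,d,b,e,c,f.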

ruiu added a comment. Feb 27 2015, 3:28 PM

This patch contains both a linker script expression evaluator and a user of that feature. Both are rather large, so I'd suggest splitting it into two patches. Can you do that? The evaluator can be tested using GnuLdDriverTest.cpp.

Hi Rui and Shankar,

Thanks for your suggestions, I will work on them. I will also split this
patch, as suggested by Rui.

rafaelauler abandoned this revision. Mar 9 2015, 6:33 AM