This is an archive of the discontinued LLVM Phabricator instance.

[ELF] - Implemented basic location counter support.
ClosedPublic

Authored by grimar on Mar 27 2016, 7:16 AM.

Details

Summary

This patch implements location counter support.
It also separates assign addresses for sections to assignAddressesScript() if it scipt exists.

Main testcase is test/ELF/linkerscript-locationcounter.s, It contains some work with location counter. It is basic now.
Implemented location counter assignment and '+' operations.

Diff Detail

Repository
rL LLVM

Event Timeline

grimar updated this revision to Diff 51734.Mar 27 2016, 7:16 AM
grimar retitled this revision from to [ELF] - Implemented prototype of location counter support..
grimar updated this object.
grimar added reviewers: ruiu, rafael.
grimar added subscribers: llvm-commits, grimar.
ruiu edited edge metadata.Mar 28 2016, 8:53 AM

Can you add a test to demonstrate what this patch will do? It doesn't have to be comprehensive at this moment though.

Also, please remove code to support all arithmetic operators but '+' and parentheses. You can start from supporting only '+' and add other operators later.

In D18499#384634, @ruiu wrote:

Can you add a test to demonstrate what this patch will do? It doesn't have to be comprehensive at this moment though.

Also, please remove code to support all arithmetic operators but '+' and parentheses. You can start from supporting only '+' and add other operators later.

Hi, test is already there actually, it is linkerscript-locationcounter.s - that is main demo for this patch.

grimar added a comment.EditedMar 28 2016, 1:54 PM
In D18499#384634, @ruiu wrote:

Can you add a test to demonstrate what this patch will do? It doesn't have to be comprehensive at this moment though.

Also, please remove code to support all arithmetic operators but '+' and parentheses. You can start from supporting only '+' and add other operators later.

I can remove all operators except '!=', '?:', '/', '+' and parentheses because all of them are used in the test mentioned (linkerscript-locationcounter.s).
All cases except one that uses '*' I took from real bsd script, so it is important part of the test which demonstrates real live situations:

# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t
# RUN: echo "SECTIONS { \
# RUN:  . = 0x12341; \
# RUN:  .data : { *(.data) } \
# RUN:  . = ALIGN(. != 0 ? 64 / 8 : 1); \
# RUN:  .text : { *(.text) } \
# RUN:  . = ALIGN(64 / 2); \
# RUN:  .temp : { *(.temp) } \
# RUN:  . = (. + 0x100) * 2;\
# RUN:  .boo : { *(.boo) } \
# RUN: }" > %t.script
ruiu added a comment.Mar 28 2016, 2:09 PM

I want you to remove all but '+' operator both from code and test so that we can focus on "core" part of this patch, which is a mechanism to set offsets to sections.

In D18499#385012, @ruiu wrote:

I want you to remove all but '+' operator both from code and test so that we can focus on "core" part of this patch, which is a mechanism to set offsets to sections.

Ok, I see. Will do.

grimar updated this revision to Diff 51880.Mar 29 2016, 2:30 AM
grimar edited edge metadata.

Addressed Rui's request:

  • Removed all operators except '+'.
  • Removed testcase unrelated to core.
ruiu added inline comments.Mar 29 2016, 10:57 AM
ELF/LinkerScript.cpp
35–37 ↗(On Diff #51880)

If there's an error, report that using error() and just return 0.

85–110 ↗(On Diff #51880)

Since we don't have to handle operator precedence with '+', please remove this from this patch.

ELF/Writer.cpp
1271–1273 ↗(On Diff #51880)

I don't understand this code. Why did you modify Load->H.p_flags?

1335–1337 ↗(On Diff #51880)

Please do not add more code this function. Instead, I think you can split it for linker script and non-linker script, because if a linker script is present, all addresses are assigned according to the linker script.

grimar updated this revision to Diff 52045.Mar 30 2016, 6:25 AM
  • Addressed Rui's comments.

Notes: if D18591 be landed, huge amount of changes from here will gone.

grimar added inline comments.
ELF/LinkerScript.cpp
35–37 ↗(On Diff #51880)

Done.

85–110 ↗(On Diff #51880)

Done.

ELF/Writer.cpp
1271–1273 ↗(On Diff #51880)

That was incorrect logic. I combined everything in one PT_LOAD, the logic I followed was described in D18317 summary but this desicion logic had issues and now I understood that.

We should just not do align first sections in loads to page boundaries if using script and can create as many PT_LOADs as requested.

1335–1337 ↗(On Diff #51880)

Separated in D18591

ruiu added inline comments.Mar 30 2016, 10:30 AM
ELF/LinkerScript.cpp
36 ↗(On Diff #52045)

Let's make Tokens const. I don't usually add const to every place which we can add const, but adding const improves readability in this case (to convey message that you are passing it as a reference just to avoid creating a copy of the vector.)

113 ↗(On Diff #52045)

So, it looks like we need to call this function with section names in the same order as the section names appear in the linker script. Is this correct? If so, is that a good API?

It feel to me that this patch doesn't capture a correct abstraction of the writer and the linker script. Both LinkerScript and Writer have only partial knowledge of the section layout, and the interaction between them is not well defined. I'd like to see a different approach.

So, if a linker script has a SECTIONS command, the entire file layout is determined by the linker script, right? Can you pass the list of sections or the symbol table to LinkerScript and let it layout all the things?

grimar added inline comments.Mar 30 2016, 10:47 AM
ELF/LinkerScript.cpp
113 ↗(On Diff #52045)

So, it looks like we need to call this function with section names in the same order as the section names appear in the linker script. Is this correct?

No, that not correct.

We are calling this when already have a list of Output sections and just need to assign addresses:

for (OutputSectionBase<ELFT> *Sec : OutputSections) {
...
    VA = Script->getSectionVA(Sec->getName(), VA);

It does not matter what is the order of sections in script at that time, because we already have a list of outputsections and just iterating over it.
Location counter can not be moved backward, so it can be moved only forward.
We push the VA to script because it uses is to calculate "." variable. It returns back updated location.

ruiu added inline comments.Mar 30 2016, 10:53 AM
ELF/LinkerScript.cpp
113 ↗(On Diff #52045)

So you cannot call this function with a random order, right?

Let's say you have a linker script with this command.

SECTIONS {
  .data : { *(.data) }
  .text : { *(.text) }
}

And if you call getSectionVA(".text", VA) and then getSectionVA(".data", VA), they wouldn't return correct values, no?

grimar added inline comments.Mar 30 2016, 11:12 AM
ELF/LinkerScript.cpp
113 ↗(On Diff #52045)

That is very wierd usecase of this patch. Why would you want to call it in that order ? How can this happen now ? It's just not what this patch intended to handle.

So your idea to have something like:

Script->getSectionVA(Sec->getName());

(without current VA, which we know only during actual assigninAddress() now).
Which will always return the VA no matter from where you are calling it. So we will not need assignAddressScript() at all then, almost, as we will already have all sections layouted.

ruiu added inline comments.Mar 30 2016, 11:21 AM
ELF/LinkerScript.cpp
113 ↗(On Diff #52045)

Yes, it is weird, but that is what I meant.

This function has a hidden state and rely on the caller to call it in some "correct" order. In that sense the caller and the callee are coupled fairly tightly. That is not necessarily be NG, but as an interface between the linker script and the writer, that is probably too tight. I want a simpler API.

grimar added inline comments.Mar 30 2016, 11:25 AM
ELF/LinkerScript.cpp
113 ↗(On Diff #52045)

I meant it is wierd for this patch, in general that is just absolutely different approach, which is also ok.
Ok, I understood your point, will try to look closer how to do what you want tomorrow. Have a good day.

grimar updated this revision to Diff 52217.Mar 31 2016, 8:48 AM
  • Addressed Rui's implementation suggestions.
grimar added inline comments.Mar 31 2016, 8:51 AM
ELF/LinkerScript.cpp
193 ↗(On Diff #52217)

I rewrote the logic and now it seems it works like you wanted (at least it has Script->getSectionVA(StringRef Name) method now).

So this is a new prototype, not well polished, but shows the consept. I don't like separate dummies sections processing, but have not good idea now. Will think about it.

grimar added inline comments.Mar 31 2016, 8:56 AM
ELF/LinkerScript.cpp
172 ↗(On Diff #52217)

This will not be here (ends), I am going to prepare patch to move that like was suggested.

ELF/Writer.cpp
1417 ↗(On Diff #52217)

This also probably can be moved out for all 3 assign*() methods.

grimar added inline comments.Mar 31 2016, 9:06 AM
ELF/LinkerScript.h
71 ↗(On Diff #52217)

And also not sure about that place. I did not use reference because I need vector to be copied. Is it ok or better to do copy manually inside ?

ruiu added a comment.Mar 31 2016, 2:50 PM

This is I think better, although it needs brush-ups in terms of coding style.

ELF/Writer.cpp
1421–1436 ↗(On Diff #52217)

This is very similar to the code in assignAddresses.

grimar updated this revision to Diff 52364.Apr 1 2016, 6:31 AM
  • Cleaned up the code.
  • Addressed review comment.
grimar updated this object.Apr 1 2016, 6:34 AM
grimar added inline comments.
ELF/Writer.cpp
1442 ↗(On Diff #52364)

This will gone if D18691 be landed.

1421–1436 ↗(On Diff #52217)

Fixed.

grimar added inline comments.Apr 1 2016, 7:35 AM
ELF/LinkerScript.cpp
574 ↗(On Diff #52364)

I will put

if (Node.Args.empty())
  error(...)

here in next iteration. We cannot have something like:

. =;

And it is also assumed in code that Args not empty.

grimar updated this revision to Diff 52394.Apr 1 2016, 10:57 AM
grimar marked an inline comment as done.
  • Rebased.
ruiu added inline comments.Apr 1 2016, 11:06 AM
ELF/Writer.cpp
1417 ↗(On Diff #52394)

Why do we still have to call setVA from the Writer although I think the point of this patch is to do it in the linker script?

1421–1425 ↗(On Diff #52394)

Do you still need this?

ruiu added inline comments.Apr 1 2016, 11:06 AM
ELF/LinkerScript.h
71 ↗(On Diff #52364)

Why do you need this function? If the linker script directly assign VAs to sections, we don't need this.

grimar updated this revision to Diff 52399.Apr 1 2016, 11:30 AM
grimar marked an inline comment as done.
  • Addressed review comments.
ELF/LinkerScript.h
71 ↗(On Diff #52394)

If so - yes.

ELF/Writer.cpp
1417 ↗(On Diff #52394)

Misunderstanding :) I thought you want to have Script->getSecVA() method.
Without that all life is easier of course.
Removed whole method.

1421–1425 ↗(On Diff #52394)

Nope. Removed.

ruiu added inline comments.Apr 1 2016, 3:31 PM
ELF/LinkerScript.cpp
36 ↗(On Diff #52399)

It seems that we don't need this class because this class has only one member, LocCounter and you always reset LocCounter before calling evaluate. This class can just be a function that takes the current VA and an expression and returns a new VA.

72 ↗(On Diff #52399)

It is better to move this variable inside LinkerScript.

109 ↗(On Diff #52399)

Does this have to be a lambda?

125 ↗(On Diff #52399)

Why do you have to remove the element here?

267–270 ↗(On Diff #52399)

The concept of dummy sections seems a bit confusing. Could you improve it?

ELF/LinkerScript.h
52–54 ↗(On Diff #52399)

I wouldn't use the same variable for completely different purposes. I'd define two fields, Expr and SectionName, instead.

grimar added inline comments.Apr 2 2016, 2:17 AM
ELF/LinkerScript.cpp
109 ↗(On Diff #52399)

Please see answer in below comment.

125 ↗(On Diff #52399)

I need to visit all output sections to assign addresses for each.
Script can mention only some of them.
There were 2 choices I saw:

  1. Make this lambda and do 2 iterations. First iteration visits everything that was directly mentioned in script. During that it removes the sections if found from global list. After that we will have all unvisited sections in this global list. Then second iteration visits them all.

That gives 2 iterations, but complexity is O(n) in total.

  1. In this variant it is possible to insert all sections that were not mentioned in script to Locations list at the begining of method and avoid using lambda. That would be one loop. Problem here that is that if we have OutputSectionBase list and have Locations lists then how to find those sections for which LocationNode was not created to vizit ?

That would be something like (pseudocode):

for (OutputSectionBase<ELFT> *Sec : S) {
  if (S NOT IN Locations) {
    Locations.push_back(create location for S);
  }
}

That is O(N^2) part of code that can be inserted at the begining of this method, that can help to get rid of labmda and do all in one iteration.

ruiu added inline comments.Apr 2 2016, 7:55 AM
ELF/LinkerScript.cpp
125 ↗(On Diff #52399)

This is a bit too convoluted. It'd need a few trial-and-errors, but it should be able to be written in a straightforward way. I'd probably start making this lambda a separate function.

610 ↗(On Diff #52399)

Exist -> Exists

629 ↗(On Diff #52399)

Remove S.

ELF/LinkerScript.h
48 ↗(On Diff #52399)

I'd make this a enum class and rename members as shown below.

LocNodeType -> Command
LocSet -> Expr
LocOutSet -> Section

ELF/Writer.cpp
228 ↗(On Diff #52399)

Let's name this assignAddresses for consistency with the Writer.

grimar added a comment.Apr 2 2016, 8:06 AM

Ok, thank you for review, I`ll address this all later today/tomorrow I think (except dummies).

ELF/LinkerScript.cpp
267–270 ↗(On Diff #52399)

I guess you mean not only for this patch but whole dummies approach, right ?
We can probably just get rid of them, like we had before ?

ruiu added inline comments.Apr 2 2016, 8:11 AM
ELF/LinkerScript.cpp
267–270 ↗(On Diff #52399)

Yes, I'm talking about the approach itself. I'm not sure what you mean by "just get rid of them" (can we do that? I think we need some replacements for them,) but the point is that the code here for the dummy sections seem a bit too complex, so it needs to be improved.

grimar added inline comments.Apr 2 2016, 8:16 AM
ELF/LinkerScript.cpp
267–270 ↗(On Diff #52399)

Ok. When I say "get rid" I mean that some time ago they were not sections. So we definetely can do that and just the question "how to do this better now ?" left. I`ll try to suggest some patch to change that.

grimar added inline comments.
ELF/LinkerScript.cpp
267–270 ↗(On Diff #52399)

So, my suggestion patch about dummies sections is D18743.

I`ll update/rebase this as soon as D18743 be landed (If will). The last one can help to simplify the code of this I think.

grimar updated this revision to Diff 52893.Apr 7 2016, 3:46 AM
grimar marked 7 inline comments as done.
  • Addressed review comments.
ELF/LinkerScript.cpp
36 ↗(On Diff #52399)

Done.

72 ↗(On Diff #52399)

Since that class not exist, variable also not exist anymore.

125 ↗(On Diff #52399)

Done. Made lambda as separate function + refactored code to have only one iteration.

610 ↗(On Diff #52399)

Done.

629 ↗(On Diff #52399)

Done.

ELF/LinkerScript.h
48 ↗(On Diff #52399)

Done.

52–54 ↗(On Diff #52399)

Done.

ELF/Writer.cpp
228 ↗(On Diff #52399)

Done.

grimar updated this revision to Diff 53242.Apr 11 2016, 7:39 AM
  • Rebased
ruiu added inline comments.Apr 11 2016, 2:14 PM
ELF/LinkerScript.cpp
424 ↗(On Diff #53242)

I think you should remove addLocation function and directly add new elements to Locations vector since you intentionally merge two different control flow into one function and dispatch again using the first argument. That's just confusing.

ELF/LinkerScript.h
47 ↗(On Diff #53242)

Let's make this an enum class because it currently adds "Section" and "Expr" to lld::elf namespace.

48 ↗(On Diff #53242)

Add blank line before this struct definition.

74 ↗(On Diff #53242)

Arg -> Args

grimar updated this revision to Diff 53374.Apr 12 2016, 2:26 AM
  • Addressed review comments.
ELF/LinkerScript.cpp
424 ↗(On Diff #53242)

Removed.

ELF/LinkerScript.h
47 ↗(On Diff #53242)

Done.

48 ↗(On Diff #53242)

Done.

74 ↗(On Diff #53242)

There is no such method anymore.

ruiu added inline comments.Apr 12 2016, 12:59 PM
ELF/LinkerScript.cpp
51–65 ↗(On Diff #53374)

Define this method

uint64_t getInteger(StringRef S) {
  uint64_t V;
  if (S.getAsInteger(0, V)) {
    setError("malformed number: " + V);
    return 0;
 }
  retrurn V;
}

Then you can simplify this to

StringRef Tok = Tokens[I];
if (Tok == ".")
  Result += LocCounter;
else
  Result += getInteger(Tok);
94 ↗(On Diff #53374)

Are you aware of the fact that StringMap copies keys? Maybe DenseMap is more efficient because it doesn't copy StringRefs?

113 ↗(On Diff #53374)

Needs () around &.

131 ↗(On Diff #53374)

This function seems a bit too complicated. This function construct temporary data to feed to setSectionsAddresses. Do we really need to do that? Isn't there any way to make it simpler?

154 ↗(On Diff #53374)

Flip the condition and use early continue.

grimar updated this revision to Diff 53563.Apr 13 2016, 8:13 AM
grimar marked 4 inline comments as done.
  • Addressed review comments.
ELF/LinkerScript.cpp
51–65 ↗(On Diff #53374)

Done.
Btw, few iterations ago I had code like this.
http://reviews.llvm.org/D18499?id=52399#inline-156527
Looks when you requested to make a single function from whole class I understood this too straightforward (combined exactly into one function) :)

94 ↗(On Diff #53374)

I missed that fact. Thanks for pointing on that.
There is no such method and variable anymore.

113 ↗(On Diff #53374)

Done.

131 ↗(On Diff #53374)

Reimplemented in much more straighforward way.

154 ↗(On Diff #53374)

There is no such code anymore.

grimar added inline comments.Apr 13 2016, 10:11 AM
ELF/LinkerScript.cpp
95 ↗(On Diff #53563)

At the same time if we talk about perfomance then I think most narrow place is lookup and not creation of container.
So I can guess StringMap still was faster since it does not recompute hash during lookup.
I didn't check since code was cut off.

ruiu added inline comments.Apr 13 2016, 2:27 PM
ELF/LinkerScript.cpp
98–99 ↗(On Diff #53563)

Although it is technically correct that what we are doing here is different from the doc, I don't think it necessarily mean it needs to be marked as "FIXME". My impression is that if you control section layout using linker scripts, you need to give complete layout. So please remove "FIXME" label from this comment and just state the fact:

Orphan sections are sections present in the input files which are not explicitly placed into the output file by the linker script. We place orphan sections at end of file. Other linkers places them using some heuristics as described in https://sourceware.org/binutils/docs/ld/Orphan-Sections.html#Orphan-Sections.

ELF/LinkerScript.h
69 ↗(On Diff #53563)

fixUp -> fixup

grimar updated this revision to Diff 53682.Apr 14 2016, 3:08 AM
  • Addressed review comments.
ELF/LinkerScript.cpp
98–99 ↗(On Diff #53563)

My impression is that if you control section layout using linker scripts, you need to give complete layout.

Perfect, done ! I was thinking about the same often. For me it was always a bit wierd that linker do something on his own when script is given. That complicates the whole things at the same time privides no strict rules, but only heuristics, what can be confusing.
I was not sure that is acceptable to change that behavior though.

ELF/LinkerScript.h
69 ↗(On Diff #53563)

Done.

ruiu added a comment.Apr 14 2016, 12:26 PM

Please update the commit message. This is not a prototype even though its feature is limited. This is a patch that will be committed as part of the linker.

ELF/LinkerScript.cpp
40 ↗(On Diff #53682)

You probably have to do this

error("malformed number: " + Twine(V))
115 ↗(On Diff #53682)

Call fixupLocations here instead of exporting it and call it from the Writer.

129–130 ↗(On Diff #53682)

This shouldn't happen now, no?

485–492 ↗(On Diff #53682)

Remove.

ELF/LinkerScript.h
68–69 ↗(On Diff #53682)

Make it private.

grimar updated this revision to Diff 53875.Apr 15 2016, 6:35 AM
grimar marked 4 inline comments as done.
  • Addressed review comment.
ELF/LinkerScript.cpp
40 ↗(On Diff #53682)

I think what I want is:

error("malformed number: " + S);

Updated testcase.

115 ↗(On Diff #53682)

Done.

129–130 ↗(On Diff #53682)

This can happen. Imagine asm code:

.global _start
_start:
 nop

and script that contains section description, but this section is absent in code above:

SECTIONS {
 .stub : { *(.stub) }
}

Since there is no such section, OutputSections list will not contain it, but location node will still be created and needs to be skipped then.

485–492 ↗(On Diff #53682)

Done.

ELF/LinkerScript.h
68–69 ↗(On Diff #53682)

Done.

grimar retitled this revision from [ELF] - Implemented prototype of location counter support. to [ELF] - Implemented basic location counter support..Apr 15 2016, 6:38 AM

About D19107 and SectionOrder vector.
What about landing this and eliminating the SectionOrder at all ?

At fact when we have Locations structure, SectionOrder is just a subset. We can use
Locations instead to sort sections in compareSections.

The only problem then - that fixupLocations() should be called much earlier then.
I think previous iteration was better for such purposes.

grimar updated this revision to Diff 53886.Apr 15 2016, 8:00 AM
  • Fixed clang warning.
ruiu accepted this revision.Apr 15 2016, 10:42 AM
ruiu edited edge metadata.

LGTM

This revision is now accepted and ready to land.Apr 15 2016, 10:42 AM
In D18499#402693, @ruiu wrote:

LGTM

Thank you for all these reviews of that !

This revision was automatically updated to reflect the committed changes.