This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Implement minimal PHDRS parser and section-to-segment assignment
ClosedPublic

Authored by evgeny777 on Jul 5 2016, 8:19 AM.
Tokens
"Mountain of Wealth" token, awarded by Hektve87.

Details

Reviewers
ruiu
ikudrin
Summary

This patch implemts:

a) Very basic support for PHDRS parsing. Only header name, type, FILEHDR and PHDRS attributes are supported
b) Simple section-to-segment assignment algorithm:

  • All segments listed inside PHDRS are created, no matter they contain any section or not
  • Linker script can assign section to one or more segments
  • New PT_LOAD segment is created for orphaned sections with SHF_ALLOC attribute.

Diff Detail

Event Timeline

evgeny777 updated this revision to Diff 62754.Jul 5 2016, 8:19 AM
evgeny777 retitled this revision from to [ELF] Implement minimal PHDRS parser and section-to-segment assignment.
evgeny777 updated this object.
evgeny777 added reviewers: ruiu, ikudrin.
evgeny777 set the repository for this revision to rL LLVM.
evgeny777 added a project: lld.
ruiu added inline comments.Jul 6 2016, 5:33 PM
ELF/LinkerScript.cpp
484

As you can see above, our naming scheme in this class is readFoo where Foo is a command name in the linker script. Because this function reads a PHDRS command, it should be named readPhdrs.

670

Likewise, it should be readPhdrsType.

ELF/Writer.cpp
998–999

Can you move this code to LinkerScript.cpp? We don't expose linker script internals to the Writer. For example, SECTIONS commands are hidden behind LinkerScript::assignAddresses function.

ruiu added inline comments.Jul 6 2016, 5:33 PM
ELF/LinkerScript.cpp
738

Type can be any expression that is evaluated to an integer, but you are handling the case only when there is a literal integer which might be a bit confusing. Can you report an error for now?

ELF/LinkerScript.h
63–64

Rename HasFilehdr and HasPhdrs.

72

PhdrsCommands would be a better name because there are many other header types.

evgeny777 added inline comments.Jul 6 2016, 10:15 PM
ELF/LinkerScript.cpp
738

These three lines are equivalent:

header PT_NULL;
header 0;
header 0x0;
ELF/Writer.cpp
998–999

This requires exposing Phdrs, toPhdrFlags, isRelroSection, needsPtLoad, needsInterpSection and other private Writer stuff (defined in Writer.cpp) to LinkerScript object.

Is this, what really neeeded?

evgeny777 added inline comments.Jul 7 2016, 2:20 AM
ELF/LinkerScript.cpp
738

Sorry, I think I might have misunderstood you.
So you do not want to habdle the case when header type is specified as an integer value (like "header 0x0") for now (may be because it can also be an expression which evaluates to integer) and want to report an error, in case mnemonic name is not matched, right?

evgeny777 updated this revision to Diff 63053.Jul 7 2016, 5:20 AM

I have updated review. All issues fixed, except one (moving createPhdrsUsingSpec to LinkerScript.cpp), because I don't understand very well, how this should be done properly
As I wrote earlier, createPhdrsUsingSpec uses a lot of private stuff from Writer:

  • Phdrs collection
  • Phdr structure definition
  • Many helper functions: neetdsPtLoad, needsPtInterp, isOutputDynamic, isRelroSection and others

All of the stuff above is defined in Writer.cpp and is declared private. Writer itself is a template class, so moving part of its definition to header file might not be a good idea.
So, please suggest how this should be done properly? Move all required stuff to Writer.h header file (may be to lld::elf namespace)? Something else?

ruiu edited edge metadata.Jul 7 2016, 3:00 PM

I think it's worth to export struct Phdr, toPhdrFlags, isOutputDynamic and so on from Writer.h so that we can move the new code to LinkerScript.cpp. Can you try?

ELF/LinkerScript.cpp
407–408

Indices and ItSect.

414

ItHdr

663

As we name other functions, this should be readPhdrs.

734

it -> It

evgeny777 updated this revision to Diff 63209.Jul 8 2016, 5:48 AM
evgeny777 edited edge metadata.

Review updated.
Common stuff moved to Writer.h
createPhdrsUsingSpec moved to LinkerScript as createScriptedPhdrs.
Explicit template instantiation added where needed.

ruiu added inline comments.Jul 8 2016, 2:56 PM
ELF/LinkerScript.cpp
289

Replace getHeaderSpec() with Opt.PhdrsCommands`.

344–345
} else {
480–484

Sort alphabetically.

663

Move this before readSearchDir.

ELF/LinkerScript.h
97

Since this is LinkerScript class, it is obvious that it is "scripted", so I'd name createPhdrs.

99

Make this a private function.

ELF/Writer.cpp
226–229

We have ScriptConfig->DoLayout in this function, so for consistency it is better to dispatch to a LinkerScript function in this file instead of from createPhdrs.

if (Script<ELFT>::X->hasPhdrsCommand())
  Phdrs = Script<ELFT>::X->createPhdrs();
else
  Phdrs = createPhdrs();
494–497

Move them at end of this file just like other template instantiations.

919

getHeaderSpec() is used only to call size(), so it's overkill. Define bool LinkerScript::hasPhdrsCommand() and use it instead.

922

I'd move this line to setPhdrs.

ELF/Writer.h
24–38

Remove blank lines.

42

Do you need this last line in the comment?

evgeny777 added inline comments.Jul 11 2016, 1:27 AM
ELF/Writer.cpp
922

This won't work, because size of program headers is used in Writer<ELFT>::assignAddresses, which is called before setPhdrs. I've moved it to fixHeaders() instead

evgeny777 updated this revision to Diff 63476.Jul 11 2016, 1:29 AM

Review updated

ruiu added inline comments.Jul 11 2016, 2:51 PM
ELF/LinkerScript.cpp
281–282

Why don't you make it a pure function?

std::vector<Phdr> LinkerScript<ELFT>::createPhdrs(ArrayRef...)
290

Use emplace_back.

399

The term HeaderSpec is used only for this function, so it is not easy to understand what it means at first sight. Why don't you name this hasPhdrsCommands?

403

This function needs a function comment about what it is supposed to do.

405

ELF contains various types of headers such as section headers or program headers. So it is probably better to name this getPhdrIndices.

evgeny777 added inline comments.Jul 11 2016, 2:59 PM
ELF/LinkerScript.cpp
290

Unlike emplace, emplace_back doesn't return iterator (it has void return type). Is it really that better?

ruiu added inline comments.Jul 11 2016, 3:01 PM
ELF/LinkerScript.cpp
290

Well, then how about using emplace_back and back? Then you are going to have an object of Phdr type rather than an iterator to Phdrs, which I believe improve readability (by replacing auto with Phdr.)

grimar added a subscriber: grimar.Jul 12 2016, 2:33 AM
evgeny777 updated this revision to Diff 63669.Jul 12 2016, 5:25 AM
evgeny777 added a subscriber: llvm-commits.

Review updated

grimar added inline comments.Jul 12 2016, 6:20 AM
ELF/LinkerScript.cpp
286

I think we do not use int.
Use what you need instead: int32_t I guess.

317
if (Out<ELFT>::EhFrame->empty() || !Out<ELFT>::EhFrameHdr)
  continue;
ELF/LinkerScript.h
19

We usually put lld headers before others ones. Move it before #include "lld/Core/LLVM.h" please.

ELF/Writer.h
36

I think this comment can be 2 lines (did not check though).

test/ELF/linkerscript-phdrs.s
4

Please format to have max 80 symbols per line (you can refer to other testcases we have for reference)

8

please add few spaces to format accordinly.

17

Usually when we do not need - we dont leave hex flags values check.
I mean we you want write:

# CHECK-NEXT:    Flags [
# CHECK-NEXT:      PF_R
# CHECK-NEXT:      PF_W
# CHECK-NEXT:      PF_X
# CHECK-NEXT:    ]
22

Remove excessive empty line.

evgeny777 added inline comments.Jul 12 2016, 6:58 AM
ELF/LinkerScript.cpp
286

ssize_t is not (definitly) int.
It is normally used to hold either size or error.

ELF/Writer.h
36

It's been clang-format'ted

test/ELF/linkerscript-phdrs.s
18

Hmm. Have you tried this:

grep PF_R test/ELF/*.s
evgeny777 updated this revision to Diff 63675.Jul 12 2016, 7:00 AM
evgeny777 removed rL LLVM as the repository for this revision.

Review updated according to comments from grimar

grimar added inline comments.Jul 12 2016, 7:11 AM
ELF/LinkerScript.cpp
286
#if defined(_WIN64)
typedef signed __int64 ssize_t;
#else
typedef signed int ssize_t;
#endif /* _WIN64 */

It is defined in that way for me.
I think class ErrorOr is used for holding errors in llvm.

We have just a few usings for it in llvm and none in lld at all for that type. I think you want to change it to something different and more consistent with lld code.

ELF/Writer.h
36

I am no doubt it was. But I think it can be 2 lines instead of 3.

test/ELF/linkerscript-phdrs.s
18

Nope. I am windows user I even do not have grep :)
But general rule is not to check something that is not important for the testcase.
We are trying to follow it anyways, even if missed something in previous commits.

evgeny777 added inline comments.Jul 12 2016, 7:21 AM
ELF/LinkerScript.cpp
286

Well, ErrorOr<T> looks a way too heavy to represent valid/invalid array index, doesn't it?

grimar added inline comments.Jul 12 2016, 7:25 AM
ELF/LinkerScript.cpp
286

Right. That is why I suggested to use int32_t/uint32_t depending on what you need here. It is what we generally use in lld for such cases.
But we do not use int and never used ssize_t yet I think.

evgeny777 added inline comments.Jul 12 2016, 7:33 AM
ELF/LinkerScript.cpp
286

There is plenty of int and plenty of size_t. Unfortunately I don't know any windows analog for 'grep -w', so just beleive me :)
Yes, there isn't any ssize_t, right, but on the other hand - why not use it?

grimar added inline comments.Jul 12 2016, 7:39 AM
ELF/LinkerScript.cpp
286

Well, I think there should be a reason for using something new, like new type in you code. What is the reason for you ?
We used other types all the way before, so at least there is a reason not to use ssize_t - and that readon is consistency.
At the same time that is just my opinion here, I would not intoduce using of new type, but lets wait for another reviewers comments then.

ruiu added inline comments.Jul 12 2016, 5:12 PM
ELF/LinkerScript.cpp
286

I have a concern about ssize_t because (unlike size_t) it is not in the C standard. It is defined in POSIX, so it may be unavailable on non-Unix systems. So I'd prefer int32_t or just int. (I don't have a strong preference here because it is unrealistic we have more than 2^31 elements.)

628–633

Can you break as early as possible?

if (Tok == ";")
  break;
if (Tok == "FILEHDR")
  ...
ELF/LinkerScript.h
102

Add a blank line before private:.

ELF/Writer.cpp
1303–1311

* Sec -> * (we usually omit parameter names here)

Review updated according to comments from grimar

(~~~~)

Review updated.
Common stuff moved to Writer.h
createPhdrsUsingSpec moved to LinkerScript as createScriptedPhdrs.
Explicit template instantiation added where needed.

(~~~~)

I have updated review. All issues fixed, except one (moving createPhdrsUsingSpec to LinkerScript.cpp), because I don't understand very well, how this should be done properly
As I wrote earlier, createPhdrsUsingSpec uses a lot of private stuff from Writer:

  • Phdrs collection
  • Phdr structure definition
  • Many helper functions: neetdsPtLoad, needsPtInterp, isOutputDynamic, isRelroSection and others

All of the stuff above is defined in Writer.cpp and is declared private. Writer itself is a template class, so moving part of its definition to header file might not be a good idea.
So, please suggest how this should be done properly? Move all required stuff to Writer.h header file (may be to lld::elf namespace)? Something else?

(~~~~)

I have updated review. All issues fixed, except one (moving createPhdrsUsingSpec to LinkerScript.cpp), because I don't understand very well, how this should be done properly
As I wrote earlier, createPhdrsUsingSpec uses a lot of private stuff from Writer:

  • Phdrs collection
  • Phdr structure definition
  • Many helper functions: neetdsPtLoad, needsPtInterp, isOutputDynamic, isRelroSection and others

All of the stuff above is defined in Writer.cpp and is declared private. Writer itself is a template class, so moving part of its definition to header file might not be a good idea.
So, please suggest how this should be done properly? Move all required stuff to Writer.h header file (may be to lld::elf namespace)? Something else?

(~~~~)

Review updated

(~~~~)

evgeny777 updated this revision to Diff 63787.Jul 13 2016, 2:59 AM

Review updated

I wonder if it can be pushed to trunk now?

ruiu added a comment.Jul 18 2016, 8:07 AM

I'm sorry I didn't notice that you have uploaded a new patch. I'm sorry. I'll take a look today.

ruiu accepted this revision.Jul 18 2016, 5:52 PM
ruiu edited edge metadata.

LGTM with these changes. Thanks!

ELF/LinkerScript.cpp
290

Again, it is hard to read if you call both ELF program headers and linker script's PHDRS command "headers". Rename Hdr -> Cmd.

306

continue and break are interchangeable in this context, but mixing them is confusing. Please be consistent -- I'd use break here.

321

Ditto

339

You can remove Script<ELFT>::X->.

732

This should be readOutputSectionPhdrs.

739

Return a vector of StringRefs from this function instead of mutating a member.

752

Mutating the last phdrs command here is too subtle. You want to make this function return a phdr type instead of setting it to a member and returning nothing.

ELF/LinkerScript.h
50

We are handling various types of headers, so "Headers" is confusing. Rename Phdrs.

This revision is now accepted and ready to land.Jul 18 2016, 5:52 PM

(~~~~)

ELF/LinkerScript.cpp
290

(~~~~)

Review updated

(~~~~)

(~~~~)

Review updated

(~~~~)

(~~~~)

Review updated

(~~~~)

In D21995#487820, @ruiu wrote:

LGTM with these changes. Thanks!

(~~~~)

(~~~~)

Review updated

(~~~~)

In D21995#487820, @ruiu wrote:

LGTM with these changes. Thanks!

(~~~~)

Review updated.
Common stuff moved to Writer.h
createPhdrsUsingSpec moved to LinkerScript as createScriptedPhdrs.
Explicit template instantiation added where needed.

(~~~~)

evgeny777 closed this revision.Jul 19 2016, 9:08 AM
In D21995#487820, @ruiu wrote:

LGTM with these changes. Thanks!

[[~~~~]]