[ELF] Default to entry address 0x0 in case start symbol is not defined and entry point is not specified in command line
ClosedPublic

Authored by evgeny777 on Aug 26 2016, 8:33 AM.

Details

Summary

The default behavior of GNU linkers is to set the entry address to 0x0 when there is no start symbol (_start) and no entry point is passed to the linker on the command line. Unfortunately, lld reports an undefined symbol error in this case (undefined symbol _start) and fails to link the program. The alternative is to always pass -Wl,-e,0 (or "-e 0"), which is not very convenient.
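
The fallback requested here can be sketched as follows. This is a simplified illustration, not lld's code: `SymbolTable` is modeled as a plain map from defined symbol names to addresses, and `defaultEntryAddress` is a hypothetical name. It only covers the case where no entry point was given on the command line.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for the linker's symbol table:
// maps each *defined* symbol name to its address.
using SymbolTable = std::map<std::string, uint64_t>;

// Entry address when no -e option was given:
// use _start if some input defines it, otherwise fall back to 0.
uint64_t defaultEntryAddress(const SymbolTable &symtab) {
  auto it = symtab.find("_start");
  if (it != symtab.end())
    return it->second; // _start is defined: use its address
  return 0;            // no _start: default to 0x0 instead of failing
}
```

With this model, a link without `_start` succeeds with entry 0x0, which is the same result that passing "-e 0" by hand is meant to give.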

Diff Detail

Repository
rL LLVM

Event Timeline

evgeny777 updated this revision to Diff 69378.Aug 26 2016, 8:33 AM
evgeny777 retitled this revision from to [ELF] Default to entry address 0x0 in case start symbol is not defined and entry point is not specified in command line.
evgeny777 updated this object.
evgeny777 added a reviewer: ruiu.
evgeny777 set the repository for this revision to rL LLVM.
evgeny777 added a project: lld.
evgeny777 added subscribers: grimar, ikudrin, llvm-commits.
emaste added a subscriber: emaste.Aug 29 2016, 6:49 AM

I'm not opposed to this but I am curious about the use case for binaries with entry 0 -- do you have a sample consumer?

ruiu added inline comments.Aug 29 2016, 11:00 AM
ELF/Driver.cpp
552–557 ↗(On Diff #69378)

Can you move this after the code adding files to the symbol table? Once you add all files to the symbol table, you know whether _start is resolvable or not, so you can add an undefined symbol _start only when available. In that way, I think you can remove ForcedEntry variable.

evgeny777 added inline comments.Aug 30 2016, 6:04 AM
ELF/Driver.cpp
552–557 ↗(On Diff #69378)

Unfortunately, I can't. The main problem is that, in general, the _start symbol should already exist in the symtab before files are added; otherwise LTO can optimize everything out.

Ed, we have a custom system with a proprietary microkernel and lots of applications/services running on top of it.
The entry point in our executables is always the beginning of the RX segment; there is no _start symbol. There are a lot
of build scripts for various components, and setting "-Wl,-e,0" everywhere is not a very convenient solution. That said,
I'm not doing this just for fun )

I didn't think you were doing it for fun, and I had no opposition to the change. I was just curious how such an ELF file would be useful, since I assumed all users would make use of _start / the entry point. So, sounds good to me.

rafael added inline comments.
ELF/Driver.cpp
552–557 ↗(On Diff #69378)

We don't check IsUsedInRegularObj for LTO until every file is read, so what causes it to decide that everything can be gced?

Moving

if (!Config->Entry.empty()) {

down past

for (std::unique_ptr<InputFile> &F : Files)

only seems to change the order of symbols in the output and should allow you to check if there is a _start in the symbol table already.

BTW, if the input is just

.quad _start

we should still produce an error and I think this patch would not.

Also, it looks like bfd defaults to some other value:

ld.bfd: warning: cannot find entry symbol _start; defaulting to 0000000000400040
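
The ordering suggested in this comment, together with the `.quad _start` case, could look roughly like the sketch below. It is a toy model under stated assumptions, not lld's implementation: `Symbol`, `InputFile`, `addFile` and `resolveEntry` are hypothetical names, and the default of 0 is the value this patch proposes rather than the address bfd picks.

```cpp
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical symbol record: either defined at an address or merely referenced.
struct Symbol {
  bool defined;
  uint64_t address;
};
using SymbolTable = std::map<std::string, Symbol>;

// Hypothetical input file: the symbols it defines and the ones it references.
struct InputFile {
  std::map<std::string, uint64_t> defines;
  std::vector<std::string> references;
};

void addFile(SymbolTable &symtab, const InputFile &f) {
  for (const auto &def : f.defines)
    symtab[def.first] = Symbol{true, def.second};
  for (const std::string &name : f.references)
    // insert() leaves an existing entry alone, so a definition is never
    // downgraded to an undefined reference.
    symtab.insert(std::make_pair(name, Symbol{false, 0}));
}

// Add every file first, then decide what to do about _start.
uint64_t resolveEntry(const std::vector<InputFile> &files) {
  SymbolTable symtab;
  for (const InputFile &f : files)
    addFile(symtab, f);

  auto it = symtab.find("_start");
  if (it == symtab.end())
    return 0; // nothing mentions _start: use the default entry
  if (!it->second.defined)
    // Some input references _start (e.g. ".quad _start") but none defines it.
    throw std::runtime_error("undefined symbol: _start");
  return it->second.address;
}
```

The point of the sketch is the three-way distinction: a defined `_start` is used, an absent one falls back to the default, and one that is referenced but never defined is still an error.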

down past
for (std::unique_ptr<InputFile> &F : Files)

As far as I understand, in the case of bitcode files, symbols are added to the symtab in a call to addCombinedLtoObject(), which is called after the LTO passes, right?
I will check all the other issues.

evgeny777 updated this revision to Diff 70402.Sep 6 2016, 8:03 AM
evgeny777 removed rL LLVM as the repository for this revision.

Diff updated. Here I'm not adding the entry symbol to the symtab unless explicitly told to. Two extra things were done:

  • If the entry symbol is being inserted, IsUsedInRegularObj is always set to true to avoid incorrect LTO optimizations
  • Extra checks are needed in addLazyArchive and addLazyObject to always create the object file if it contains the start symbol

The position of the _start symbol in the symbol table is now arbitrary, which is why some test cases were modified.
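
The two extra checks described above can be illustrated with a toy model. This is only a sketch of the idea, not the code in this diff: `Kind`, `Symbol`, `SymbolTable` and `addLazy` are hypothetical, and "loading" an archive member is reduced to marking the symbol as defined.

```cpp
#include <map>
#include <string>

enum class Kind { Undefined, Lazy, Defined };

struct Symbol {
  Kind kind;
  bool isUsedInRegularObj; // if false, LTO may treat the symbol as removable
};

struct SymbolTable {
  std::string entryName; // e.g. "_start"
  std::map<std::string, Symbol> syms;

  // Called for each symbol offered by an archive member or lazy object.
  // Returns true if the providing file should be loaded right away.
  bool addLazy(const std::string &name) {
    auto it = syms.find(name);
    if (it != syms.end() && it->second.kind != Kind::Undefined)
      return false; // already defined, or already recorded as lazy

    const bool referenced = it != syms.end(); // an undefined reference exists
    const bool isEntry = name == entryName;
    if (!referenced && !isEntry) {
      syms[name] = Symbol{Kind::Lazy, false}; // remember it, load on demand
      return false;
    }

    // Load the file. The entry symbol is always marked as used so that
    // LTO does not discard it together with everything reachable from it.
    const bool used =
        isEntry || (referenced && it->second.isUsedInRegularObj);
    syms[name] = Symbol{Kind::Defined, used};
    return true;
  }
};
```

In this model an archive member that provides the entry symbol is pulled in even though no input references it, which is the behavior the second bullet asks for.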

ruiu edited edge metadata.Sep 6 2016, 3:19 PM

Doesn't something like this (https://reviews.llvm.org/D24282) work? I think this is what Rafael and I were suggesting.

evgeny777 updated this revision to Diff 70508.Sep 7 2016, 1:28 AM
evgeny777 edited edge metadata.

Great hint, thanks! Diff updated.

ruiu accepted this revision.Sep 7 2016, 3:32 PM
ruiu edited edge metadata.

LGTM

This revision is now accepted and ready to land.Sep 7 2016, 3:32 PM
This revision was automatically updated to reflect the committed changes.