The default behavior of GNU linkers is to set the entry address to 0x0 when there is no start symbol (_start) and no entry point is passed to the linker on the command line. Unfortunately, lld reports an undefined symbol error in this case (undefined symbol _start) and fails to link the program. The workaround is to always pass -Wl,-e,0 (or "-e 0"), which is not very convenient.
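The fallback requested above can be sketched roughly as follows. This is only an illustration of the intended behavior, not lld's actual code: `AddressTable`, `EntryResult`, `parseAddress`, and `resolveEntry` are hypothetical names, and treating a missing explicit entry symbol as an error is an assumption made for the sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the linker's symbol table: defined symbol -> address.
using AddressTable = std::map<std::string, uint64_t>;

struct EntryResult {
  bool ok;          // false models a link error
  uint64_t address; // meaningful only when ok is true
};

// Parse a numeric -e argument such as "0" or "0x400000".
EntryResult parseAddress(const std::string &s) {
  try {
    std::size_t pos = 0;
    unsigned long long value = std::stoull(s, &pos, 0); // base 0: decimal, hex, octal
    if (pos == s.size())
      return {true, static_cast<uint64_t>(value)};
  } catch (const std::exception &) {
    // Not a number; report failure below.
  }
  return {false, 0};
}

// entryArg is the value of -e, or empty when the option was not given.
EntryResult resolveEntry(const AddressTable &symtab, const std::string &entryArg) {
  if (!entryArg.empty()) {
    auto it = symtab.find(entryArg);
    if (it != symtab.end())
      return {true, it->second};   // explicit entry symbol
    return parseAddress(entryArg); // explicit numeric entry (assumed: else error)
  }
  auto it = symtab.find("_start");
  if (it != symtab.end())
    return {true, it->second};     // conventional start symbol
  return {true, 0};                // no _start and no -e: default to 0
}
```

With this shape, build scripts that never define _start would link without passing -Wl,-e,0, while an explicit -e value still takes precedence.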
Repository: rL LLVM
Event Timeline
I'm not opposed to this but I am curious about the use case for binaries with entry 0 -- do you have a sample consumer?
| ELF/Driver.cpp | | |
|---|---|---|
| 552–557 | (On Diff #69378) | Can you move this after the code that adds files to the symbol table? Once you have added all files to the symbol table, you know whether _start is resolvable or not, so you can add an undefined symbol _start only when it is available. That way, I think you can remove the ForcedEntry variable. |
Ed, we have a custom system with a proprietary microkernel and lots of applications/services running on top of it.
The entry point in our executables is always the beginning of the RX segment; there is no _start symbol. There are a lot
of build scripts for various components, and setting "-Wl,-e,0" everywhere is not a very convenient solution. That said,
I'm not doing this just for fun :)
| ELF/Driver.cpp | | |
|---|---|---|
| 552–557 | (On Diff #69378) | Unfortunately, I can't. The main problem is that, in general, the _start symbol should exist in symtab before adding files; otherwise LTO can optimize out everything. |
I didn't think you were doing it for fun, and I had no opposition to the change; I was just curious how such an ELF file would be useful, since I assumed all users would make use of _start / the entry point. So, sounds good to me.
| ELF/Driver.cpp | | |
|---|---|---|
| 552–557 | (On Diff #69378) | We don't check IsUsedInRegularObj for LTO until every file is read, so what causes it to decide that everything can be GC'ed? |
Moving
    if (!Config->Entry.empty()) {
down past
    for (std::unique_ptr<InputFile> &F : Files)
only seems to change the order of symbols in the output and should allow you to check whether there is already a _start in the symbol table.
BTW, if the input is just
    .quad _start
we should still produce an error, and I think this patch would not.
Also, it looks like bfd defaults to some other value:
ld.bfd: warning: cannot find entry symbol _start; defaulting to 0000000000400040
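To make the `.quad _start` case concrete, here is a minimal sketch of the distinction being asked for: a _start that nobody mentions may default to 0, but a _start that some input references and nobody defines should still be an error. The names `SymState`, `StateTable`, `EntryDecision`, and `decideEntry` are hypothetical and only model the check once all files have been added to the symbol table.

```cpp
#include <map>
#include <string>

// Hypothetical symbol states after all input files have been read.
enum class SymState { Undefined, Defined };
using StateTable = std::map<std::string, SymState>;

enum class EntryDecision { UseStart, DefaultToZero, ErrorUndefined };

// Decide what to do about the entry point when -e was not given.
EntryDecision decideEntry(const StateTable &symtab) {
  auto it = symtab.find("_start");
  if (it == symtab.end())
    return EntryDecision::DefaultToZero;  // nobody defines or references _start
  if (it->second == SymState::Defined)
    return EntryDecision::UseStart;       // normal case
  return EntryDecision::ErrorUndefined;   // e.g. ".quad _start" with no definition
}
```

The point of the three-way result is that the default of 0 applies only when _start is entirely absent from the symbol table, which is why the check has to run after the input files are added.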
As far as I understand, in the case of bitcode files, symbols will be added to symtab in a call to addCombinedLtoObject(), which is called after the LTO passes, right?
Will check all other issues.
Diff updated. Here I'm not adding the entry symbol to symtab unless explicitly told to. Two extra things were done:
- If the entry symbol is being inserted, then IsUsedInRegularObj is always set to true to avoid incorrect LTO optimizations
- Extra checks are needed in addLazyArchive and addLazyObject to always create the object file if it contains the start symbol
The position of the _start symbol in the symbol table is arbitrary now, which is why some test cases were modified.
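The second bullet can be pictured with a simplified model. Real lazy symbol handling in lld works per symbol and is more involved; the `LazyFile` struct and `shouldLoad` function below are hypothetical and only show why an extra check is needed once _start is no longer pre-inserted as an undefined symbol.

```cpp
#include <set>
#include <string>

// Hypothetical model of a lazy input (for example an archive member):
// only the names it defines are known until it is actually loaded.
struct LazyFile {
  std::set<std::string> defines;
};

// Decide whether a lazy file must be turned into a real object file.
bool shouldLoad(const LazyFile &file, const std::set<std::string> &undefined,
                const std::string &entryName) {
  // Extra check: a file that defines the entry symbol is always loaded,
  // even though nothing in the symbol table refers to it yet.
  if (!entryName.empty() && file.defines.count(entryName))
    return true;
  // Usual rule: load the file if it resolves a currently undefined symbol.
  for (const std::string &name : file.defines)
    if (undefined.count(name))
      return true;
  return false;
}
```

Without the first check, an archive member that provides _start would never be pulled in, and the link would silently fall back to entry 0.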
Doesn't something like this (https://reviews.llvm.org/D24282) work? I think this is what Rafael and I were suggesting.