This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Emit __ehdr_start when there is a segment containing program headers
Needs ReviewPublic

Authored by phosek on May 4 2017, 5:36 PM.

Details

Reviewers
ruiu
Summary

The current logic for defining ehdr_start doesn't match the GNU linkers, whereas LLD does not define ehdr_start whenever a linker script is being used, while GNU linkers defined the symbol whenever there's a loadable segment which contains ELF headers. This change modifies LLD to use the same logic as GNU linkers.

This is complicated by the fact that LLD checks undefined symbols while scanning relocations, which happens before binary layout is finalized and program headers have been created. To handle this, we define the __ehdr_start symbol unconditionally, and only check whether it should have been defined after program headers have been created; if not and there are references to this symbol, we report an error.

Fixes https://bugs.llvm.org/show_bug.cgi?id=32367

Diff Detail

Repository
rL LLVM

Event Timeline

phosek created this revision.May 4 2017, 5:36 PM
ruiu added inline comments.May 5 2017, 1:40 PM
ELF/Writer.cpp
1190

What is this doing?

1195

Re-scanning all relocations is probably too slow.

phosek added inline comments.May 5 2017, 2:52 PM
ELF/Writer.cpp
1190

The idea is that ElfSym::EhdrStart is only defined if needed by addOptionalRegular, so if it's nullptr we can ignore it.

1195

I'm aware of that, do you have any ideas on how to do it faster? I don't know if there's any way to find relocations for a given symbol which is what's really needed here. Alternative would be to print an error without giving section and offset info, but that'll make it inconsistent with error reports for regular undefined symbols.

ruiu edited edge metadata.May 5 2017, 2:58 PM

Isn't there any way to determine if a ELF header is going to be included in some segment before scanning relocations?

Or, do you have any problem if you unconditionally define the symbol (and set it to zero if the condition is not satisfied)?

phosek added a comment.May 5 2017, 3:08 PM
In D32887#747666, @ruiu wrote:

Isn't there any way to determine if a ELF header is going to be included in some segment before scanning relocations?

I can add a method to LinkerScript that checks if ELF header is going to included and define __ehdr_start conditionally based on that, would that be fine with you?

ruiu added a comment.May 5 2017, 3:10 PM

It depends on complexity, but worth a try.

phosek updated this revision to Diff 98045.May 5 2017, 8:00 PM
phosek added a comment.May 5 2017, 8:03 PM

That solution isn't going to work because it wouldn't handle case where we only have SECTIONS with enough space for headers. I have added a test case for that. A solution to avoid rescaning all relocations I have implemented is to simply collect all references to __ehdr_start during the existing scan and then report the error if needed.

phosek updated this revision to Diff 98249.May 8 2017, 7:54 PM
ruiu added a comment.May 9 2017, 11:16 AM

I'm reluctant to introduce a new concept of SymbolReference just for ehdr_start symbol which is frankly speaking not that important. You needed these new concepts for defining ehdr_start only when an ELF header is memory-allocated, right? Is there any problem if you always define that symbol and leave it zero if an ELF header is not in an allocated segment?

In D32887#749970, @ruiu wrote:

I'm reluctant to introduce a new concept of SymbolReference just for ehdr_start symbol which is frankly speaking not that important. You needed these new concepts for defining ehdr_start only when an ELF header is memory-allocated, right? Is there any problem if you always define that symbol and leave it zero if an ELF header is not in an allocated segment?

I can use just tuple to avoid defining a new concept, that should be sufficient. The reason I don't like always defining __ehdr_start is because it changes its semantics and hence might break the existing code that relies on the existing semantics. In our case, it'll break the kernel build so we'll need a special case for LLD.

ruiu added a comment.May 9 2017, 12:02 PM

Right, but we already violated the assumption for symbols such as __bss_start as it is always defined even if .bss does not exist.

How are you using __ehdr_start?

In our dynamic linker to find the image start. That use case is not a problem. We have some special binaries like vDSO which are using custom linker script and later post-processed and it's the post-processing stage I'm concerned about. I'd need to try and see if it's going to handle this.

Outside of our codebase, the main issue I can see (other than implementing semantics that differs from GNU linkers) is in code that uses __ehdr_start and uses a linker script. If the linker script is incorrect and the __ehdr_start cannot be defined, with GNU linkers they'll get a linker error. If LLD always defines __ehdr_start, the code will link but segfault at runtime when you try to access __ehdr_start.

ruiu added a comment.May 9 2017, 1:08 PM

These possible compatibility issues could be real concerns, but I think we should go with a simple way which is to define ehdr_start unconditionally. When we need to strike a balance among simplicity, efficiency and compatibility, we sometimes chose simplicity or efficiency over the 100% precise compatibility, and it seems to be working surprisingly well. It feels __ehdr_start is in the same category. We can add more code later if it really needs more complexity.

mcgrathr added a comment.EditedMay 9 2017, 3:56 PM

For the record, the intent of __ehdr_start is exactly that it be defined if and only if the ELF file header and program headers are indeed visible via some PT_LOAD segment (i.e. in the PAGE_TRUNCATE(p_vaddr)..PAGE_ROUND(p_vaddr + p_filesz) region) and not be defined otherwise.
I invented the feature in the GNU linkers, and this is the specification I gave it.
Code can use a weak reference to __ehdr_start and check for &__ehdr_start==0 to handle both cases. For that scenario, resolving it to absolute 0 even when there is a strong reference rather than diagnosing the undefined symbol is probably acceptable in practice.

In LLD, the only time it could ever be omitted is when there is a custom linker script being used. So this is unlikely to come up.

grimar added a subscriber: grimar.May 10 2017, 12:39 AM