This is an archive of the discontinued LLVM Phabricator instance.

Differential D55864

[elfabi] Write program headers, .dynamic, .dynstr, and .shstrtab
Needs Review · Public

Authored by jakehehrlich on Dec 18 2018, 4:30 PM.

Details

Reviewers

mcgrathr
phosek
ruiu
jhenderson
amontanez
plotfi
• espindola

Summary

This patch adds the majority of the work needed for writing binary ELF stub files. While only the dynamic entries are written as contents, most of this patch involves the surrounding framework of an ELF file that allows the .dynamic and .dynstr sections to exist happily.

Added:

Writes section headers.
Writes two program headers (PT_DYNAMIC and a PT_LOAD).
Writes .dynstr (only containing strings from .dynamic).
Writes .dynamic.
Writes .shstrtab.
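
As a rough illustration of the .dynstr item above, here is a minimal sketch of a string table that hands out offsets for the strings referenced from .dynamic. `StringTable` is a hypothetical helper written for this note, not the patch's code, and it does no suffix merging: each distinct string is stored once, NUL-terminated, after the mandatory leading NUL byte.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical helper (not from the patch): a naive ELF string table.
class StringTable {
public:
  // Returns the offset of Str, appending it if it has not been seen before.
  std::size_t add(const std::string &Str) {
    auto It = Offsets.find(Str);
    if (It != Offsets.end())
      return It->second;
    std::size_t Offset = Data.size();
    Data += Str;
    Data.push_back('\0');
    Offsets.emplace(Str, Offset);
    return Offset;
  }

  // Raw bytes of the table, ready to be copied into the section.
  const std::string &data() const { return Data; }

private:
  // Offset 0 always holds the empty string.
  std::string Data = std::string(1, '\0');
  std::unordered_map<std::string, std::size_t> Offsets;
};
```

A DT_SONAME or DT_NEEDED entry would then store the value returned by `add()` as its string offset.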

This patch is very large, and may later be broken down if possible.

Diff Detail

Repository: rL LLVM

Event Timeline

amontanez created this revision. Dec 18 2018, 4:30 PM

Herald added a subscriber: llvm-commits. Dec 18 2018, 4:30 PM

amontanez mentioned this in D55839: [elfabi] Add support for writing ELF header for binary stubs. Jan 18 2019, 12:49 PM

Rebase, split out bigger functions to use helper functions, add a simple test.

Cleaned up significantly, added tests, moved from WIP status to ready for review. This patch is very large. I'm open to splitting it up after a brief discussion on how people would prefer I do that.

amontanez added a parent revision: D55839: [elfabi] Add support for writing ELF header for binary stubs. Jan 23 2019, 12:37 PM

Removing from review queues until I look into write commands requested in D55839

amontanez added a child revision: D57216: [elfabi] Add support for writing dynamic symbols. Jan 24 2019, 6:30 PM

Gonna take this one too.

Herald added a project: Restricted Project. Apr 26 2019, 4:59 PM

Ok so this is the rough idea that I have for doing this. I need to add program headers, which turns out to require refactoring this a bit. The general idea takes some explanation, and there are multiple ways to think about it. The goal is for everything to "work itself out" automagically. I'll need to add program headers because many bits of functionality are not easily testable here unless we add them. You'll note that I put the NOBITS sections before the other sections. This doesn't jibe well with using a single program header to cover them all, and since space is a thing we'd like to constrain here we should support it. Unfortunately I did that as an optimization, as a means of avoiding the dynamic symbol table needing to know the addresses of the NOBITS sections before starting. This can be overcome using an additional loop however. I think an additional loop would be needed even if you hand rolled it, and by adding this loop we can probably avoid maintaining an intermediate vector for the dynamic symbol table. Instead we can compute the overall size first, allocate the FileOutputBuffer, and then write directly into it. I'm not sure the complexity or redundancy is justified by the memory savings and I think the time savings will be pretty minimal.

First Way of thinking about it (As a Build System)

You can think of each Lazy value as being a build step. That build step then references other build steps as dependencies. Before completing this build step it goes on to evaluate the other build steps first. As long as no cyclic dependencies show up it all works out. To ensure that there are no cyclic dependencies we have two methods. Method 1) is the Blackhole boolean, which is set to true while a Lazy value is being evaluated so that if evaluation ever gets back to the same value it can raise an error. The name Blackhole comes from some literature regarding a different way of thinking about it. Method 2) is to assume the minimum possible at each build step. By assuming the minimum possible at each build step you ensure (generally) that you logically can't have any cycles without the idea as a whole being invalid. This is not completely true for lots of little reasons but the main issue is that assumptions produce faster code. Right now I have the number of loops over the symbols down to just 2, which is the minimum I think is possible if you were to hand roll this as is.
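
Method 1 above can be sketched roughly as follows. `Lazy` here is a hypothetical stand-alone class written for illustration (the patch's own class is not visible in this excerpt), and it throws an exception where the real code may report the error differently.

```cpp
#include <functional>
#include <optional>
#include <stdexcept>
#include <utility>

// Hypothetical sketch: a lazily evaluated value with cycle detection.
template <typename T> class Lazy {
public:
  explicit Lazy(std::function<T()> F) : Func(std::move(F)) {}

  // Evaluates the thunk on first use and caches the result.
  const T &get() {
    if (Value)
      return *Value;
    // If we re-enter while already evaluating, there is a dependency cycle.
    if (Blackhole)
      throw std::logic_error("cyclic dependency between lazy values");
    Blackhole = true;
    Value.emplace(Func());
    Blackhole = false;
    return *Value;
  }

private:
  std::function<T()> Func;
  std::optional<T> Value;
  bool Blackhole = false;
};
```

A value such as a section offset would capture the `Lazy` objects it depends on by reference and call `get()` on them inside its own thunk, so dependencies are evaluated on demand and at most once.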

Second Way of thinking about it (As futures)

You can think of each Lazy value as a promise to compute that at some point. Each future is then resolved as soon as it can be but it might have to wait on another future. Deadlock occurs when there's a cycle. In fact if we used std::future with delayed evaluation instead of Lazy this would work and we could process the graph in parallel. The graph is quite small at the moment since individual symbols are not members of the graph however so I don't think that would be useful.
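
For comparison, the same dependency structure can be expressed with standard deferred futures. This is only a sketch: `exampleDynamicOffset` and the three values it wires together are made up for illustration, with 64 and 56 being the ELF64 header and program header entry sizes. With `std::launch::deferred` nothing runs until `get()` is called, and each value is computed at most once.

```cpp
#include <cstdint>
#include <future>

// Hypothetical layout computation built from deferred futures.
inline std::uint64_t exampleDynamicOffset() {
  std::shared_future<std::uint64_t> EhdrSize =
      std::async(std::launch::deferred, [] { return std::uint64_t{64}; })
          .share();
  std::shared_future<std::uint64_t> PhdrsSize =
      std::async(std::launch::deferred, [] { return std::uint64_t{2 * 56}; })
          .share();
  // This future depends on the two above; they are evaluated on demand.
  std::shared_future<std::uint64_t> DynamicOffset =
      std::async(std::launch::deferred,
                 [EhdrSize, PhdrsSize] {
                   return EhdrSize.get() + PhdrsSize.get();
                 })
          .share();
  return DynamicOffset.get();
}
```

Unlike the Blackhole approach, a cycle between deferred futures is not reported as an error, so this form relies on the graph being acyclic by construction.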

Third Way of thinking about it (As an instance of the Lob combinator)

My background before working in systems was in programming language theory. There's something called the Lob combinator which can take a self-referential, spaghetti-like data structure and actually produce it with links and all. It has many forms and this is one of them. The Lob combinator would evaluate all of the nodes (a node in such a structure is equivalent to a node in the build graph from Way 1) but we only really need a few. The Lob combinator, in its truly magical form, only works with lazy values however. So I'm just replicating that here. Lazy values are represented by "thunks" in practice. The idea of making a thunk a "blackhole" while it's being evaluated to catch cycles comes from some readings I did on GHC's implementation of Haskell, which lets it dynamically catch some forms of infinite loops that don't make progress (but not all, lest it solve the halting problem).

Fourth Way of thinking about it (As an instance of a build system a la "Build Systems a la carte")

This is like a kind of crappy (or more restricted if you'd rather) version of what they outline in that paper for a dynamic build system. The Lob combinator and build systems are thus then related.

Herald added a reviewer: • espindola. Apr 30 2019, 7:16 PM

Herald added subscribers: MaskRay, arichardson, emaste.

Oh and this is the big picture. I'll be splitting this up somehow.

You may have put the rationale somewhere but they are not easily searchable. Please allow me to ask why there is a desire to write yet another ELF emitter. Can't we invest more in existing tools like yaml2obj? Is there any inherent limitation that prevents it from being more dynamic?

llvm/llvm/test/tools/llvm-elfabi/binary-write-pheaders.test
18 ↗	(On Diff #197498)	`{{.*$}}` can just be omitted. Many other tests do this. If you want to use `FileCheck --match-full-lines` to omit the regex in `4096{{$}}`, they have to be kept.
llvm/llvm/test/tools/llvm-elfabi/binary-write-sheaders.test
60 ↗	(On Diff #197498)	It'd be great to get rid of `.shstrtab` and just use `.strtab`. Newer llvm (since 2015) no longer produces `.shstrtab` as a size optimization.
llvm/test/tools/llvm-elfabi/write-elf32be-ehdr.test
7	Change `x86_64` to `powerpc`, `armv7a`, etc to make big-endian more realistic.
llvm/tools/llvm-elfabi/ELFObjHandler.cpp
54	Other than having a mutable `Blackhole`, you can make `Func = []() { llvm_unreachable(...); };` while evaluating the thunk.
87	Early return?
89	`Value.emplace()`.
511	`T` -> `T `
571	What is this section used for?
587	Which architecture uses the section name `.tls`? Did you mean `.tbss`?
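
The suggestion on line 54 (replace the stored function instead of keeping a separate flag) could look roughly like the sketch below. `LazyNoFlag` is a hypothetical name, and the trap throws an exception rather than calling `llvm_unreachable` so that the example stays self-contained.

```cpp
#include <functional>
#include <optional>
#include <stdexcept>
#include <utility>

// Hypothetical sketch: cycle detection without a separate boolean.
template <typename T> class LazyNoFlag {
public:
  explicit LazyNoFlag(std::function<T()> F) : Func(std::move(F)) {}

  const T &get() {
    if (Value)
      return *Value;
    // Take the real computation out and leave a trap in its place, so that
    // re-entering get() during evaluation hits the trap.
    std::function<T()> Compute = std::move(Func);
    Func = []() -> T {
      throw std::logic_error("cycle while evaluating thunk");
    };
    Value.emplace(Compute());
    return *Value;
  }

private:
  std::function<T()> Func;
  std::optional<T> Value;
};
```

One side effect of this design is that the original closure is released as soon as the value has been computed.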

MaskRay added subscribers: beanz, grimar. Apr 30 2019, 8:16 PM

I can't upload a new patch right now but I plan on making all recommended changes, even the ones I didn't reply to.

In D55864#1485695, @MaskRay wrote:

You may have put the rationale somewhere but they are not easily searchable. Please allow me to ask why there is a desire to write yet another ELF emitter. Can't we invest more in existing tools like yaml2obj? Is there any inherent limitation that prevents it from being more dynamic?

Good question. yaml2obj doesn't do the same thing and doesn't exist as a library. The point of this binary is to be as small as possible while maintaining linkability and compatibility with tools like llvm-readobj and other such tools. There is some talk of taking the code in llvm-objcopy, getting it up to where it could be used as a library, making a number of other tweaks, and then using that as a more general purpose emission library. That would also not do the same thing however. The trouble is that every ELF emitter has been designed with a specific purpose in mind and most were not designed as libraries. Furthermore when you consider what each use case has to compute and then specify independent of a general emission library it's not clear you really save much. You have the MC code, you have lld's code, you have llvm-objcopy, you have yaml2obj, etc... but they all have their quirks and specific use cases. The MC emitter has an interface that would be unwieldy and imprecise here. llvm-objcopy doesn't help with setting these addresses, can't construct dynamic symbol tables in this way, assumes that it has access to an input file from which it's making modifications, doesn't support adding dynamic symbols yet, and generally makes it hard (on purpose) to tweak the memory image of a file. yaml2obj is really only designed to produce test files and is IMO not in a position to replace this tool. Recent work has given it program header support, dynamic symbol table support, and made it get in the way of specifying specific offsets and addresses when you know what you're doing but it still doesn't rise to the level of something we could use here. Even if it did rise to the level of something we could use here we would just wind up constructing its internal format using the same amount of code only to have it go and do the same basic thing.

Here's a claim. The ELFTypes library *is* the universal emitter already. Each of those tools is performing a particular computation that the others are not, except perhaps yaml2obj, which is really just working its way towards a low fidelity textual representation of ELF files. There are some commonly repeated things here between all of these libraries that could stand to be abstracted out but it's actually not the majority of the code in any of these tools or libraries. Until a serious proposal for a universal emitter comes up we keep rewriting those common parts. Still, 90% of what's common is already solved by the ELFTypes library.

llvm/llvm/test/tools/llvm-elfabi/binary-write-pheaders.test
18 ↗	(On Diff #197498)	Yep. Don't worry about the tests for now. I'll need to update them and they're going to change drastically.
llvm/llvm/test/tools/llvm-elfabi/binary-write-sheaders.test
60 ↗	(On Diff #197498)	That's actually what this patch does. These tests are inherited from the previous version of this patch and have not been modified to work yet. After I remove the changes planned part the tests will work. We'll actually need even more tests than what I have here to properly test this since the tests at hand only check a few sections and .dynamic contents.
llvm/tools/llvm-elfabi/ELFObjHandler.cpp
54	Good idea! I think that's actually what GHC does. I'll do that.
511	Yeah there are a bunch of formatting issues. I'll run clang-format on these when I upload next.
571	It's the section in which all defined non-TLS symbols are defined. It's synthesized as a NOBITS section to avoid using any space.
587	.tbss would imply that this is writable when it isn't. In practice read-only data is already thread safe so there is no need for a read only TLS section. There is also in general no reason to have a read-only NOBITS section since the content is just zeros. This means there aren't good names to use for what I'm doing here. These section names don't matter functionally however. I'm open to many other names. I just went for short names where a name used in the wild didn't exist.

ormris added a subscriber: ormris. May 1 2019, 2:21 PM

Ok we're getting closer

Formatted everything
Added suggestions
Program headers are now added
All tests now pass
In theory (but almost certainly not in reality) this should represent a minimal viable product with the exception of support for Alignment

What remains:

Add support for Alignment
Add tests for dynamic symbol table, def size, and tls size
Split up into more manageable chunks so that humans can review this. Any review now is still appreciated of course.

MaskRay added inline comments. May 1 2019, 9:52 PM

llvm/tools/llvm-elfabi/ELFObjHandler.cpp
587	"In practice read-only data is already thread safe so there is no need for a read only TLS section." So why do you need the non-writable `.tls`?

jakehehrlich marked an inline comment as done. May 2 2019, 1:36 PM

jakehehrlich added inline comments.

llvm/tools/llvm-elfabi/ELFObjHandler.cpp
587	I don't really need it, but I want to minimize size and complexity. The requirements that I found (which might be minimal):

You must have a .dynamic section if you have a soname or DT_NEEDED. Right now this is added even if no soname or needed libraries exist.
To test with, inspect, and work with a file with a .dynamic section you need a PT_DYNAMIC segment.
To have a PT_DYNAMIC segment you need a PT_LOAD which covers it. Permissions on either of these seem not to matter.
If you have TLS symbols defined in your module you need a TLS section. Using a TLS nobits saves space.
I have not observed that a TLS segment is needed but I haven't really gotten to testing that out (I will remove it if it isn't needed).

I'd like to have only one PT_LOAD that covers everything, not just the PT_DYNAMIC segment (though that is an option). That means .dynstr and .dynsym as well as .dynamic, the TLS section, and the other definition section. For aesthetic reasons I think the permissions on the PT_LOAD should match the permissions on the sections that it covers. Consequently all sections should have the same permissions if we want only one PT_LOAD. The TLS section is arguably not even covered by the PT_LOAD so it might be fair to give it different permissions and call it .tbss, but this doesn't give us a proper name for the read-only NOBITS section that critically does need to be covered by the PT_LOAD segment. So it's an issue of aesthetic tradeoffs (cough bikeshed cough) in the face of some other motivated issues (having one PT_LOAD and using SHT_NOBITS where possible). There are a few worlds that I think try their best to make the aesthetics work out nicely:

Make the PT_LOAD writable and thus .dynsym and .dynstr as well (note that .dynamic is sometimes read only and sometimes writeable).
Make the PT_LOAD read only and have read-only NOBITS sections in both TLS and non-TLS varieties (the current choice), but not have good names; the idea of these existing is otherwise senseless and thus no previous examples of these exist.
Make the PT_LOAD read only, keep the non-TLS read-only NOBITS, but make the TLS section writeable so that we can call it ".tbss" in an aesthetically pleasing way.
Make the PT_LOAD read only, keep the sections read-only, and call them ".rodata" and ".tbss" anyhow even if those names each imply (by convention) a false fact about our sections.

As an addendum we could choose slightly different names as well. By choosing new names the aesthetic I violate is that I'm just making up section names (which happens a lot). I think anything else violates something I personally consider more fundamental like consistency, permissions consistency, using standard permissions for the standard sections we do have, etc... I think there's also an argument to be made that using differing section names makes these sorts of binaries easily recognizable.

Is llvm-elfabi used to create interface shared objects? Do you use it to create real(runnable) modules?

If not and if section/segment specification is not required, it can be very simple.
You just need the symbol table and very little other information.
There is one place in lld that checks if the symbol is in a readonly segment (addCopyRelSymbol).
1 PT_LOAD(RWX) + 1 PT_GNU_RELRO works just fine. You don't even need PT_TLS - STT_TLS symbols can be placed in an arbitrary section. Linkers and various binutils don't validate PT_TLS.

Using a TLS nobits saves space

The best layout I can think of is:
PT_LOAD(PT_GNU_RELRO(.data.rel.ro PT_TLS(.tdata | .tbss) .bss.rel.ro) | .data .bss)

Currently lld has (.tbss cannot really be SHT_NOBITS due to .data.rel.ro):
PT_LOAD(PT_GNU_RELRO(PT_TLS(.tdata .tbss) | .data.rel.ro .bss.rel.ro) | .data .bss)

There is much to say on this topic.

(note that .dynamic is sometimes read only and sometimes writeable)

Do you mean ld.lld -z rodynamic (D33251 http://lists.llvm.org/pipermail/llvm-dev/2017-May/113258.html)? (.dynamic is in the PT_GNU_RELRO region)

In D55864#1490016, @MaskRay wrote:

Is llvm-elfabi used to create interface shared objects? Do you use it to create real(runnable) modules?

If not and if section/segment specification is not required, it can be very simple.
You just need the symbol table and very little other information.
There is one place in lld that checks if the symbol is in a readonly segment (addCopyRelSymbol).
1 PT_LOAD(RWX) + 1 PT_GNU_RELRO works just fine. You don't even need PT_TLS - STT_TLS symbols can be placed in an arbitrary section. Linkers and various binutils don't validate PT_TLS.

Yes this is a stub reader/writer (also textual stubs which can't be linked against are a second format that has already been reviewed largely) but there are other concerns than just what works for lld, namely what works for llvm-readobj. I don't think it's worth adding the PT_GNU_RELRO segment. What's the advantage? Also I'll get rid of the PT_TLS segment. The layout I have in the latest patch is

1 PT_LOAD (R)
  .dynstr
  .dynsym
  1 PT_DYNAMIC (R)
    .dynamic
  .def
  .tls

What the flags are doesn't really matter too much to me. I wasn't aware of the linker checking for a readonly segment but I think that favors just making everything read only. I'll think about removing the .tls section as well.
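
The "PT_LOAD which covers PT_DYNAMIC" requirement from the layout above amounts to a simple range check. The sketch below uses a hypothetical `Segment` struct holding only the two program header fields needed (it is not the ELF program header type itself), and the addresses in the usage note are invented for illustration.

```cpp
#include <cstdint>

// Minimal stand-in for the program header fields used by the check.
struct Segment {
  std::uint64_t VAddr;   // p_vaddr
  std::uint64_t MemSize; // p_memsz
};

// True if Inner lies entirely within Outer's memory range.
constexpr bool covers(const Segment &Outer, const Segment &Inner) {
  return Inner.VAddr >= Outer.VAddr &&
         Inner.VAddr + Inner.MemSize <= Outer.VAddr + Outer.MemSize;
}
```

For instance, a PT_LOAD of `{0x0, 0x300}` covers a PT_DYNAMIC of `{0x200, 0x40}`, while the reverse check fails.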

Using a TLS nobits saves space

The best layout I can think of is:
PT_LOAD(PT_GNU_RELRO(.data.rel.ro PT_TLS(.tdata | .tbss) .bss.rel.ro) | .data .bss)

What do you need the PT_GNU_RELRO for? What is .tdata or .tbss for? What is .data.rel.ro for?

Currently lld has (.tbss cannot really be SHT_NOBITS due to .data.rel.ro):
PT_LOAD(PT_GNU_RELRO(PT_TLS(.tdata .tbss) | .data.rel.ro .bss.rel.ro) | .data .bss)

I don't follow.

(note that .dynamic is sometimes read only and sometimes writeable)

Do you mean ld.lld -z rodynamic (D33251 http://lists.llvm.org/pipermail/llvm-dev/2017-May/113258.html)? (.dynamic is in the PT_GNU_RELRO region)

Yep that's what I mean. The .dynamic section is generally relro but can be read only if that flag is used. We shouldn't add a PT_GNU_RELRO just to be consistent with one particular way of doing things. .dynamic is relro because of DT_DEBUG which is not always used. If it isn't used it is possible to make it read only. Why add an extra 50 or so bytes when we don't really need to?

Ok so I went digging on the PT_TLS and tls section issues. Seems like removing both the segment and the section is the right way to go.

llvm-readobj currently checks that SHT_TLS sections are in a PT_TLS so we should either have both or none. lld checks that TLS symbols are TLS consistently across the shared object and the output object so we have to use STT_TLS symbols for them (this is kind of obvious). I think lld (as you already stated) imposes no other requirements on this tool. Nothing seems to check that STT_TLS symbols are defined inside of SHF_TLS sections so we should be good to go there.
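
For reference, the symbol type discussed above lives in the low four bits of the symbol's `st_info` byte, with the binding in the high four bits, per the ELF specification. The helper names below are hypothetical; the numeric values (global binding 1, object type 1, TLS type 6) are the standard ones.

```cpp
#include <cstdint>

// Standard ELF values, given local names to avoid clashing with <elf.h>.
constexpr std::uint8_t BindGlobal = 1; // STB_GLOBAL
constexpr std::uint8_t TypeObject = 1; // STT_OBJECT
constexpr std::uint8_t TypeTls = 6;    // STT_TLS

// Packs binding and type into st_info: binding in the high nibble.
constexpr std::uint8_t makeStInfo(std::uint8_t Binding, std::uint8_t Type) {
  return static_cast<std::uint8_t>((Binding << 4) | (Type & 0xf));
}

// Extracts the 4-bit type and binding back out of st_info.
constexpr std::uint8_t symbolType(std::uint8_t Info) {
  return static_cast<std::uint8_t>(Info & 0xf);
}
constexpr std::uint8_t symbolBinding(std::uint8_t Info) {
  return static_cast<std::uint8_t>(Info >> 4);
}
```

Because the type field is only four bits wide, a sentinel such as `Unknown = 16` in `ELFSymbolType` can never collide with a real type value.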

Herald added a subscriber: hiraditya. May 3 2019, 3:10 PM

@MaskRay Have you checked other linkers or tools?

No more TLS stuff but STT_TLS symbol types are preserved which appears to be sufficient for all known use cases
Fixed all the tests, accidental extra tests, minor issues, and made all tests pass.

I think this is ready for review but I'm not going to split it up until Monday.

The following is the first split. Subsequent splits will be much smaller (adding support for program headers, adding support for soname and needed, adding support for symbols, etc...): https://reviews.llvm.org/D61767#change-LCioCf1j7osU

amontanez mentioned this in D61767: [llvm-elfabi] Emit ELF header and string table section. Jun 5 2019, 11:07 AM

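Revision Contents

Path	Size
llvm/include/llvm/TextAPI/ELF/ELFStub.h	1 line
llvm/lib/TextAPI/ELF/TBEHandler.cpp	3 lines
llvm/test/tools/llvm-elfabi/binary-write-neededlibs.test	20 lines
llvm/test/tools/llvm-elfabi/binary-write-pheaders.test	36 lines
llvm/test/tools/llvm-elfabi/binary-write-sheaders.test	87 lines
llvm/test/tools/llvm-elfabi/binary-write-soname.test	12 lines
llvm/test/tools/llvm-elfabi/binary-write-symbols.test	80 lines
llvm/test/tools/llvm-elfabi/invalid-bin-target.test	10 lines
llvm/test/tools/llvm-elfabi/missing-bin-target.test	10 lines
llvm/test/tools/llvm-elfabi/write-elf32be-ehdr.test	28 lines
llvm/test/tools/llvm-elfabi/write-elf32le-ehdr.test	28 lines
llvm/test/tools/llvm-elfabi/write-elf64be-ehdr.test	28 lines
llvm/test/tools/llvm-elfabi/write-elf64le-ehdr.test	28 lines
llvm/tools/llvm-elfabi/ELFObjHandler.h	17 lines
llvm/tools/llvm-elfabi/ELFObjHandler.cpp	533 lines
llvm/tools/llvm-elfabi/llvm-elfabi.cpp	33 lines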

Diff 198111

llvm/include/llvm/TextAPI/ELF/ELFStub.h

  [32 lines collapsed] enum class ELFSymbolType {
    // Type information is 4 bits, so 16 is safely out of range.
    Unknown = 16,
  };

  struct ELFSymbol {
    ELFSymbol(std::string SymbolName) : Name(SymbolName) {}
    std::string Name;
    uint64_t Size;
+   uint64_t Alignment;
    ELFSymbolType Type;
    bool Undefined;
    bool Weak;
    Optional<std::string> Warning;
    bool operator<(const ELFSymbol &RHS) const {
      return Name < RHS.Name;
    }
  };
  [20 lines collapsed]

llvm/lib/TextAPI/ELF/TBEHandler.cpp

  [92 lines collapsed]

  /// YAML traits for ELFSymbol.
  template <> struct MappingTraits<ELFSymbol> {
    static void mapping(IO &IO, ELFSymbol &Symbol) {
      IO.mapRequired("Type", Symbol.Type);
      // The need for symbol size depends on the symbol type.
      if (Symbol.Type == ELFSymbolType::NoType) {
        IO.mapOptional("Size", Symbol.Size, (uint64_t)0);
+       IO.mapOptional("Alignment", Symbol.Size, (uint64_t)0);
      } else if (Symbol.Type == ELFSymbolType::Func) {
        Symbol.Size = 0;
+       Symbol.Alignment = 0;
      } else {
        IO.mapRequired("Size", Symbol.Size);
+       IO.mapOptional("Alignment", Symbol.Alignment, (uint64_t)0);
      }
      IO.mapOptional("Undefined", Symbol.Undefined, false);
      IO.mapOptional("Weak", Symbol.Weak, false);
      IO.mapOptional("Warning", Symbol.Warning);
    }

    // Compacts symbol information into a single line.
    static const bool flow = true;
  [48 lines collapsed]

llvm/test/tools/llvm-elfabi/binary-write-neededlibs.test

This file was added.

				# This test ensures .dynamic strings are added to .dynstr with suffix matching.
				# RUN: llvm-elfabi %s --output-target=elf64-little %t
				# RUN: llvm-readobj --dynamic %t | FileCheck %s

				--- !tapi-tbe
				TbeVersion: 1.0
				SoName: libsomething.so
				Arch: x86_64
				NeededLibs:
				- libc.so
				- libclang.so
				- thing.so
				Symbols: {}
				...

				# CHECK: SONAME Library soname: [libsomething.so]
				# CHECK: NEEDED Shared library: [libc.so]
				# CHECK: NEEDED Shared library: [libclang.so]
				# CHECK: NEEDED Shared library: [thing.so]
				# CHECK: STRSZ 67 (bytes)

llvm/test/tools/llvm-elfabi/binary-write-pheaders.test

This file was added.

				# RUN: llvm-elfabi %s --output-target=elf64-little %t
				# RUN: llvm-readobj -h %t | FileCheck %s --check-prefix=ELFHEADER
				# RUN: llvm-readobj -l %t | FileCheck %s --check-prefix=PHDRS

				--- !tapi-tbe
				TbeVersion: 1.0
				Arch: x86_64
				Symbols: {}
				...

				# ELFHEADER: ProgramHeaderCount: 2

				# PHDRS: ProgramHeader {
				# PHDRS-NEXT: Type: PT_LOAD
				# PHDRS-NEXT: Offset: 0x0
				# PHDRS-NEXT: VirtualAddress: 0x0
				# PHDRS-NEXT: PhysicalAddress: 0x0
				# PHDRS-NEXT: FileSize:
				# PHDRS-NEXT: MemSize:
				# PHDRS-NEXT: Flags [
				# PHDRS-NEXT: PF_R
				# PHDRS-NEXT: ]
				# PHDRS-NEXT: Alignment: 4096
				# PHDRS-NEXT: }
				# PHDRS-NEXT: ProgramHeader {
				# PHDRS-NEXT: Type: PT_DYNAMIC
				# PHDRS-NEXT: Offset:
				# PHDRS-NEXT: VirtualAddress:
				# PHDRS-NEXT: PhysicalAddress:
				# PHDRS-NEXT: FileSize:
				# PHDRS-NEXT: MemSize:
				# PHDRS-NEXT: Flags [
				# PHDRS-NEXT: PF_R
				# PHDRS-NEXT: ]
				# PHDRS-NEXT: Alignment: 8
				# PHDRS-NEXT: }

llvm/test/tools/llvm-elfabi/binary-write-sheaders.test

This file was added.

				# RUN: llvm-elfabi %s --output-target=elf64-little %t
				# RUN: llvm-readobj -h %t | FileCheck %s --check-prefix=ELFHEADER
				# RUN: llvm-readobj -S %t | FileCheck %s --check-prefix=SECTIONS

				--- !tapi-tbe
				TbeVersion: 1.0
				Arch: x86_64
				Symbols: {}
				...

				# ELFHEADER: SectionHeaderCount: 5
				# ELFHEADER: StringTableSectionIndex: 1

				# SECTIONS: Section {
				# SECTIONS-NEXT: Index: 0
				# SECTIONS-NEXT: Name: (0)
				# SECTIONS-NEXT: Type: SHT_NULL
				# SECTIONS-NEXT: Flags [
				# SECTIONS-NEXT: ]
				# SECTIONS-NEXT: Address: 0x0
				# SECTIONS-NEXT: Offset: 0x0
				# SECTIONS-NEXT: Size: 0
				# SECTIONS-NEXT: Link: 0
				# SECTIONS-NEXT: Info: 0
				# SECTIONS-NEXT: AddressAlignment: 0
				# SECTIONS-NEXT: EntrySize: 0
				# SECTIONS-NEXT: }
				# SECTIONS-NEXT: Section {
				# SECTIONS-NEXT: Index: 1
				# SECTIONS-NEXT: Name: .dynstr
				# SECTIONS-NEXT: Type: SHT_STRTAB
				# SECTIONS-NEXT: Flags [
				# SECTIONS-NEXT: SHF_ALLOC
				# SECTIONS-NEXT: ]
				# SECTIONS-NEXT: Address:
				# SECTIONS-NEXT: Offset:
				# SECTIONS-NEXT: Size:
				# SECTIONS-NEXT: Link: 0
				# SECTIONS-NEXT: Info: 0
				# SECTIONS-NEXT: AddressAlignment: 0
				# SECTIONS-NEXT: EntrySize: 0
				# SECTIONS-NEXT: }
				# SECTIONS-NEXT: Section {
				# SECTIONS-NEXT: Index: 2
				# SECTIONS-NEXT: Name: .dynsym
				# SECTIONS-NEXT: Type: SHT_DYNSYM
				# SECTIONS-NEXT: Flags [
				# SECTIONS-NEXT: SHF_ALLOC
				# SECTIONS-NEXT: ]
				# SECTIONS-NEXT: Address:
				# SECTIONS-NEXT: Offset:
				# SECTIONS-NEXT: Size: 24
				# SECTIONS-NEXT: Link: 1
				# SECTIONS-NEXT: Info: 0
				# SECTIONS-NEXT: AddressAlignment: 8
				# SECTIONS-NEXT: EntrySize: 24
				# SECTIONS-NEXT: }
				# SECTIONS-NEXT: Section {
				# SECTIONS-NEXT: Index: 3
				# SECTIONS-NEXT: Name: .dynamic
				# SECTIONS-NEXT: Type: SHT_DYNAMIC
				# SECTIONS-NEXT: Flags [
				# SECTIONS-NEXT: SHF_ALLOC
				# SECTIONS-NEXT: ]
				# SECTIONS-NEXT: Address:
				# SECTIONS-NEXT: Offset:
				# SECTIONS-NEXT: Size:
				# SECTIONS-NEXT: Link: 1
				# SECTIONS-NEXT: Info:
				# SECTIONS-NEXT: AddressAlignment: 8
				# SECTIONS-NEXT: EntrySize: 16
				# SECTIONS-NEXT: }
				# SECTIONS-NEXT: Section {
				# SECTIONS-NEXT: Index: 4
				# SECTIONS-NEXT: Name: .def
				# SECTIONS-NEXT: Type: SHT_NOBITS
				# SECTIONS-NEXT: Flags [
				# SECTIONS-NEXT: SHF_ALLOC
				# SECTIONS-NEXT: ]
				# SECTIONS-NEXT: Address:
				# SECTIONS-NEXT: Offset:
				# SECTIONS-NEXT: Size: 0
				# SECTIONS-NEXT: Link: 0
				# SECTIONS-NEXT: Info: 0
				# SECTIONS-NEXT: AddressAlignment: 0
				# SECTIONS-NEXT: EntrySize: 0
				# SECTIONS-NEXT: }

llvm/test/tools/llvm-elfabi/binary-write-soname.test

This file was added.

				# RUN: llvm-elfabi %s --output-target=elf64-little %t
				# RUN: llvm-readobj --dynamic %t | FileCheck %s

				--- !tapi-tbe
				TbeVersion: 1.0
				SoName: somelib.so
				Arch: x86_64
				Symbols: {}
				...

				# CHECK: SONAME Library soname: [somelib.so]
				# CHECK: STRSZ 42 (bytes)

llvm/test/tools/llvm-elfabi/binary-write-symbols.test

This file was added.

				# RUN: llvm-elfabi %s --output-target=elf64-little %t
				# RUN: llvm-readobj --dynamic --dyn-symbols %t | FileCheck %s

				--- !tapi-tbe
				TbeVersion: 1.0
				SoName: libfoo.so
				Arch: x86_64
				NeededLibs:
				- libc.so
				- libclang.so
				- thing.so
				Symbols:
				  foo: { Type: Func }
				  bar: { Type: Object, Size: 42 }
				  baz: { Type: Object, Alignment: 64, Size: 8 }
				  not: { Type: Object, Undefined: true, Size: 128 }
				  nor: { Type: Func, Undefined: true }
				...

				#CHECK: Symbol {
				#CHECK-NEXT: Name:
				#CHECK-NEXT: Value: 0x0
				#CHECK-NEXT: Size: 0
				#CHECK-NEXT: Binding: Local
				#CHECK-NEXT: Type: None
				#CHECK-NEXT: Other: 0
				#CHECK-NEXT: Section: Undefined
				#CHECK-NEXT: }
				#CHECK-NEXT: Symbol {
				#CHECK-NEXT: Name: bar
				#CHECK-NEXT: Value: 0x240
				#CHECK-NEXT: Size: 42
				#CHECK-NEXT: Binding: Global (0x1)
				#CHECK-NEXT: Type: Object (0x1)
				#CHECK-NEXT: Other: 0
				#CHECK-NEXT: Section: .def
				#CHECK-NEXT: }
				#CHECK-NEXT: Symbol {
				#CHECK-NEXT: Name: baz
				#CHECK-NEXT: Value: 0x280
				#CHECK-NEXT: Size: 8
				#CHECK-NEXT: Binding: Global
				#CHECK-NEXT: Type: Object
				#CHECK-NEXT: Other: 0
				#CHECK-NEXT: Section: .def
				#CHECK-NEXT: }
				#CHECK-NEXT: Symbol {
				#CHECK-NEXT: Name: foo
				#CHECK-NEXT: Value: 0x288
				#CHECK-NEXT: Size: 0
				#CHECK-NEXT: Binding: Global
				#CHECK-NEXT: Type: Function
				#CHECK-NEXT: Other: 0
				#CHECK-NEXT: Section: .def
				#CHECK-NEXT: }
				#CHECK-NEXT: Symbol {
				#CHECK-NEXT: Name: nor
				#CHECK-NEXT: Value: 0x0
				#CHECK-NEXT: Size: 0
				#CHECK-NEXT: Binding: Global
				#CHECK-NEXT: Type: Function
				#CHECK-NEXT: Other: 0
				#CHECK-NEXT: Section: Undefined
				#CHECK-NEXT: }
				#CHECK-NEXT: Symbol {
				#CHECK-NEXT: Name: not
				#CHECK-NEXT: Value: 0x0
				#CHECK-NEXT: Size: 128
				#CHECK-NEXT: Binding: Global
				#CHECK-NEXT: Type: Object
				#CHECK-NEXT: Other: 0
				#CHECK-NEXT: Section: Undefined
				#CHECK-NEXT: }

				# CHECK: SONAME Library soname: [libfoo.so]
				# CHECK: NEEDED Shared library: [libc.so]
				# CHECK: NEEDED Shared library: [libclang.so]
				# CHECK: NEEDED Shared library: [thing.so]
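
The `Value` fields expected for `bar`, `baz`, and `foo` in the test above are consistent with placing defined symbols one after another in name order, rounding each start address up to the symbol's alignment. The sketch below reproduces those three values under that assumption; whether the tool computes them exactly this way is not confirmed by this excerpt, and `Sym`, `alignTo`, and `assignAddresses` are hypothetical names. The start address 0x240 is simply taken from the CHECK line for `bar`.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical description of a defined symbol.
struct Sym {
  std::uint64_t Size;
  std::uint64_t Alignment; // 0 means "no alignment requirement"
};

// Rounds Value up to the next multiple of Align (0 and 1 are no-ops).
constexpr std::uint64_t alignTo(std::uint64_t Value, std::uint64_t Align) {
  return Align <= 1 ? Value : (Value + Align - 1) / Align * Align;
}

// Assigns consecutive addresses starting at Start, honoring alignment.
inline std::vector<std::uint64_t>
assignAddresses(std::uint64_t Start, const std::vector<Sym> &Syms) {
  std::vector<std::uint64_t> Addrs;
  std::uint64_t Cur = Start;
  for (const Sym &S : Syms) {
    Cur = alignTo(Cur, S.Alignment);
    Addrs.push_back(Cur);
    Cur += S.Size;
  }
  return Addrs;
}
```

With `bar` (size 42), `baz` (size 8, alignment 64), and `foo` (size 0) starting at 0x240, this yields 0x240, 0x280, and 0x288.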

llvm/test/tools/llvm-elfabi/invalid-bin-target.test

This file was added.

				# RUN: not llvm-elfabi %s --output-target=nope %t 2>&1 | FileCheck %s

				--- !tapi-tbe
				SoName: somelib.so
				TbeVersion: 1.0
				Arch: x86_64
				Symbols: {}
				...

				# CHECK: llvm-elfabi: for the -output-target option: Cannot find option named 'nope'!

llvm/test/tools/llvm-elfabi/missing-bin-target.test

This file was added.

				# RUN: not llvm-elfabi %s %t 2>&1 | FileCheck %s

				--- !tapi-tbe
				SoName: somelib.so
				TbeVersion: 1.0
				Arch: x86_64
				Symbols: {}
				...

				# CHECK: No binary output target specified.

llvm/test/tools/llvm-elfabi/write-elf32be-ehdr.test

This file was added.

				# RUN: llvm-elfabi %s --output-target=elf32-big %t
				# RUN: llvm-readobj --file-headers %t | FileCheck %s

				--- !tapi-tbe
				TbeVersion: 1.0
				Arch: x86_64
				Symbols: {}
				MaskRay (inline comment): Change `x86_64` to `powerpc`, `armv7a`, etc. to make big-endian more realistic.
				...

				# CHECK: ElfHeader {
				# CHECK-NEXT: Ident {
				# CHECK-NEXT: Magic: (7F 45 4C 46)
				# CHECK-NEXT: Class: 32-bit (0x1)
				# CHECK-NEXT: DataEncoding: BigEndian (0x2)
				# CHECK-NEXT: FileVersion: 1{{$}}
				# CHECK-NEXT: OS/ABI: SystemV (0x0)
				# CHECK-NEXT: ABIVersion: 0{{$}}
				# CHECK-NEXT: Unused: (00 00 00 00 00 00 00)
				# CHECK-NEXT: }
				# CHECK-NEXT: Type: SharedObject (0x3)
				# CHECK-NEXT: Machine: EM_X86_64 (0x3E)
				# CHECK-NEXT: Version: 1{{$}}
				# CHECK-NEXT: Entry: 0x0{{$}}
				# CHECK: Flags [ (0x0)
				# CHECK-NEXT: ]
				# CHECK-NEXT: HeaderSize: 52{{$}}
				# CHECK-NEXT: ProgramHeaderEntrySize: 32{{$}}
				# CHECK: SectionHeaderEntrySize: 40{{$}}

llvm/test/tools/llvm-elfabi/write-elf32le-ehdr.test

This file was added.

				# RUN: llvm-elfabi %s --output-target=elf32-little %t
				# RUN: llvm-readobj --file-headers %t | FileCheck %s

				--- !tapi-tbe
				TbeVersion: 1.0
				Arch: x86_64
				Symbols: {}
				...

				# CHECK: ElfHeader {
				# CHECK-NEXT: Ident {
				# CHECK-NEXT: Magic: (7F 45 4C 46)
				# CHECK-NEXT: Class: 32-bit (0x1)
				# CHECK-NEXT: DataEncoding: LittleEndian (0x1)
				# CHECK-NEXT: FileVersion: 1{{$}}
				# CHECK-NEXT: OS/ABI: SystemV (0x0)
				# CHECK-NEXT: ABIVersion: 0{{$}}
				# CHECK-NEXT: Unused: (00 00 00 00 00 00 00)
				# CHECK-NEXT: }
				# CHECK-NEXT: Type: SharedObject (0x3)
				# CHECK-NEXT: Machine: EM_X86_64 (0x3E)
				# CHECK-NEXT: Version: 1{{$}}
				# CHECK-NEXT: Entry: 0x0{{$}}
				# CHECK: Flags [ (0x0)
				# CHECK-NEXT: ]
				# CHECK-NEXT: HeaderSize: 52{{$}}
				# CHECK-NEXT: ProgramHeaderEntrySize: 32{{$}}
				# CHECK: SectionHeaderEntrySize: 40{{$}}

llvm/test/tools/llvm-elfabi/write-elf64be-ehdr.test

This file was added.

				# RUN: llvm-elfabi %s --output-target=elf64-big %t
				# RUN: llvm-readobj --file-headers %t | FileCheck %s

				--- !tapi-tbe
				TbeVersion: 1.0
				Arch: x86_64
				Symbols: {}
				...

				# CHECK: ElfHeader {
				# CHECK-NEXT: Ident {
				# CHECK-NEXT: Magic: (7F 45 4C 46)
				# CHECK-NEXT: Class: 64-bit (0x2)
				# CHECK-NEXT: DataEncoding: BigEndian (0x2)
				# CHECK-NEXT: FileVersion: 1{{$}}
				# CHECK-NEXT: OS/ABI: SystemV (0x0)
				# CHECK-NEXT: ABIVersion: 0{{$}}
				# CHECK-NEXT: Unused: (00 00 00 00 00 00 00)
				# CHECK-NEXT: }
				# CHECK-NEXT: Type: SharedObject (0x3)
				# CHECK-NEXT: Machine: EM_X86_64 (0x3E)
				# CHECK-NEXT: Version: 1{{$}}
				# CHECK-NEXT: Entry: 0x0{{$}}
				# CHECK: Flags [ (0x0)
				# CHECK-NEXT: ]
				# CHECK-NEXT: HeaderSize: 64{{$}}
				# CHECK-NEXT: ProgramHeaderEntrySize: 56{{$}}
				# CHECK: SectionHeaderEntrySize: 64{{$}}

llvm/test/tools/llvm-elfabi/write-elf64le-ehdr.test

This file was added.

				# RUN: llvm-elfabi %s --output-target=elf64-little %t
				# RUN: llvm-readobj --file-headers %t | FileCheck %s

				--- !tapi-tbe
				TbeVersion: 1.0
				Arch: AArch64
				Symbols: {}
				...

				# CHECK: ElfHeader {
				# CHECK-NEXT: Ident {
				# CHECK-NEXT: Magic: (7F 45 4C 46)
				# CHECK-NEXT: Class: 64-bit (0x2)
				# CHECK-NEXT: DataEncoding: LittleEndian (0x1)
				# CHECK-NEXT: FileVersion: 1{{$}}
				# CHECK-NEXT: OS/ABI: SystemV (0x0)
				# CHECK-NEXT: ABIVersion: 0{{$}}
				# CHECK-NEXT: Unused: (00 00 00 00 00 00 00)
				# CHECK-NEXT: }
				# CHECK-NEXT: Type: SharedObject (0x3)
				# CHECK-NEXT: Machine: EM_AARCH64 (0xB7)
				# CHECK-NEXT: Version: 1{{$}}
				# CHECK-NEXT: Entry: 0x0{{$}}
				# CHECK: Flags [ (0x0)
				# CHECK-NEXT: ]
				# CHECK-NEXT: HeaderSize: 64{{$}}
				# CHECK-NEXT: ProgramHeaderEntrySize: 56{{$}}
				# CHECK: SectionHeaderEntrySize: 64{{$}}
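The HeaderSize, ProgramHeaderEntrySize, and SectionHeaderEntrySize values checked by the four ehdr tests above follow from the fixed ELF32 and ELF64 record layouts. As a rough sketch, the structs below are hand-written mirrors of the layouts in the ELF specification; their names are local to this example and are not LLVM's `ELFT::Ehdr`, `ELFT::Phdr`, or `ELFT::Shdr` types.

```cpp
#include <cstdint>

// Hand-written mirrors of the on-disk ELF records (names are hypothetical).
struct Ehdr32 {
  unsigned char e_ident[16];
  uint16_t e_type, e_machine;
  uint32_t e_version, e_entry, e_phoff, e_shoff, e_flags;
  uint16_t e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx;
};
struct Ehdr64 {
  unsigned char e_ident[16];
  uint16_t e_type, e_machine;
  uint32_t e_version;
  uint64_t e_entry, e_phoff, e_shoff;
  uint32_t e_flags;
  uint16_t e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx;
};
struct Phdr32 {
  uint32_t p_type, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_flags,
      p_align;
};
struct Phdr64 {
  uint32_t p_type, p_flags; // p_flags moves up front in the 64-bit layout.
  uint64_t p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_align;
};
struct Shdr32 {
  uint32_t sh_name, sh_type, sh_flags, sh_addr, sh_offset, sh_size, sh_link,
      sh_info, sh_addralign, sh_entsize;
};
struct Shdr64 {
  uint32_t sh_name, sh_type;
  uint64_t sh_flags, sh_addr, sh_offset, sh_size;
  uint32_t sh_link, sh_info;
  uint64_t sh_addralign, sh_entsize;
};

// These match the sizes the FileCheck lines expect.
static_assert(sizeof(Ehdr32) == 52 && sizeof(Ehdr64) == 64, "e_ehsize");
static_assert(sizeof(Phdr32) == 32 && sizeof(Phdr64) == 56, "e_phentsize");
static_assert(sizeof(Shdr32) == 40 && sizeof(Shdr64) == 64, "e_shentsize");
```

The sizes depend only on the ELF class, not on byte order, which is why the big-endian and little-endian variants of each test check the same numbers.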

llvm/tools/llvm-elfabi/ELFObjHandler.h

	Show All 17 Lines
	#include "llvm/TextAPI/ELF/ELFStub.h"			#include "llvm/TextAPI/ELF/ELFStub.h"

	namespace llvm {			namespace llvm {

	class MemoryBuffer;			class MemoryBuffer;

	namespace elfabi {			namespace elfabi {

				enum class ELFTarget {
				ELF32LE,
				ELF32BE,
				ELF64LE,
				ELF64BE
				};

	/// Attempt to read a binary ELF file from a MemoryBuffer.			/// Attempt to read a binary ELF file from a MemoryBuffer.
	Expected<std::unique_ptr<ELFStub>> readELFFile(MemoryBufferRef Buf);			Expected<std::unique_ptr<ELFStub>> readELFFile(MemoryBufferRef Buf);

				/// Attempt to write a binary ELF stub.
				/// This function determines appropriate ELFType using the passed ELFTarget and
				/// then writes a binary ELF stub to a specified file path.
				///
				/// @param FilePath File path for writing the ELF binary.
				/// @param Stub Source ELFStub to generate a binary ELF stub from.
				/// @param OutputFormat Target ELFType to write binary as.
				Error writeBinaryStub(StringRef FilePath, const ELFStub &Stub,
				ELFTarget OutputFormat);

	} // end namespace elfabi			} // end namespace elfabi
	} // end namespace llvm			} // end namespace llvm

	#endif // LLVM_TOOLS_ELFABI_ELFOBJHANDLER_H			#endif // LLVM_TOOLS_ELFABI_ELFOBJHANDLER_H
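The `ELFTarget` enum above carries two independent facts: the ELF class and the byte order. The following is a minimal sketch of how the `--output-target` spellings used in the tests could map onto it. The helpers `parseTarget`, `is64Bit`, and `isLittleEndian` are hypothetical names for illustration only; the tool itself registers the values through its command-line option parser rather than a hand-written lookup.

```cpp
#include <optional>
#include <string_view>

enum class ELFTarget { ELF32LE, ELF32BE, ELF64LE, ELF64BE };

// Hypothetical helper: map an --output-target spelling to an ELFTarget.
// Returns std::nullopt for unknown names such as "nope".
constexpr std::optional<ELFTarget> parseTarget(std::string_view Name) {
  if (Name == "elf32-little")
    return ELFTarget::ELF32LE;
  if (Name == "elf32-big")
    return ELFTarget::ELF32BE;
  if (Name == "elf64-little")
    return ELFTarget::ELF64LE;
  if (Name == "elf64-big")
    return ELFTarget::ELF64BE;
  return std::nullopt;
}

// Hypothetical helpers that split the enum into its two properties.
constexpr bool is64Bit(ELFTarget T) {
  return T == ELFTarget::ELF64LE || T == ELFTarget::ELF64BE;
}
constexpr bool isLittleEndian(ELFTarget T) {
  return T == ELFTarget::ELF32LE || T == ELFTarget::ELF64LE;
}
```

A writer such as `writeBinaryStub` would typically switch over the enum once and instantiate the matching template for the chosen class and byte order.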

llvm/tools/llvm-elfabi/ELFObjHandler.cpp

//===- ELFObjHandler.cpp --------------------------------------------------===//		//===- ELFObjHandler.cpp --------------------------------------------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===-----------------------------------------------------------------------===/		//===-----------------------------------------------------------------------===/

#include "ELFObjHandler.h"		#include "ELFObjHandler.h"
		#include "llvm/MC/StringTableBuilder.h"
#include "llvm/Object/Binary.h"		#include "llvm/Object/Binary.h"
#include "llvm/Object/ELFObjectFile.h"		#include "llvm/Object/ELFObjectFile.h"
#include "llvm/Object/ELFTypes.h"		#include "llvm/Object/ELFTypes.h"
#include "llvm/Support/Errc.h"		#include "llvm/Support/Errc.h"
#include "llvm/Support/Error.h"		#include "llvm/Support/Error.h"
		#include "llvm/Support/FileOutputBuffer.h"
		#include "llvm/Support/MathExtras.h"
#include "llvm/Support/MemoryBuffer.h"		#include "llvm/Support/MemoryBuffer.h"
#include "llvm/TextAPI/ELF/ELFStub.h"		#include "llvm/TextAPI/ELF/ELFStub.h"

		#include <functional>

using llvm::MemoryBufferRef;		using llvm::MemoryBufferRef;
using llvm::object::ELFObjectFile;		using llvm::object::ELFObjectFile;

using namespace llvm;		using namespace llvm;
using namespace llvm::object;		using namespace llvm::object;
using namespace llvm::ELF;		using namespace llvm::ELF;

namespace llvm {		namespace {
namespace elfabi {
		using namespace llvm::elfabi;

// Simple struct to hold relevant .dynamic entries.		// Simple struct to hold relevant .dynamic entries.
struct DynamicEntries {		struct DynamicEntries {
uint64_t StrTabAddr = 0;		uint64_t StrTabAddr = 0;
uint64_t StrSize = 0;		uint64_t StrSize = 0;
Optional<uint64_t> SONameOffset;		Optional<uint64_t> SONameOffset;
std::vector<uint64_t> NeededLibNames;		std::vector<uint64_t> NeededLibNames;
// Symbol table:		// Symbol table:
uint64_t DynSymAddr = 0;		uint64_t DynSymAddr = 0;
// Hash tables:		// Hash tables:
Optional<uint64_t> ElfHash;		Optional<uint64_t> ElfHash;
Optional<uint64_t> GnuHash;		Optional<uint64_t> GnuHash;
};		};

		// Lazy assumes that T is default constructible.
		// Lazy also acts like a read-only, move-only type.
		template <class T> class Lazy {
		private:
		mutable std::function<void(T &)> Func;
		mutable Optional<T> Value;

		public:
		MaskRay (inline comment): Other than having a mutable `Blackhole`, you can make `Func = []() { llvm_unreachable(...); };` while evaluating the thunk.
		jakehehrlich (author, inline comment): Good idea! I think that's actually what GHC does. I'll do that.
		Lazy(const Lazy &) = delete;
		Lazy(Lazy &&) = default;
		// Allow Lazy values to be default constructed to an empty state.
		Lazy() = default;
		explicit Lazy(std::function<void(T &)> &&F) : Func(std::move(F)) {}

		Lazy &operator=(std::function<void(T &)> &&F) {
		// Once a thunk has been assigned don't allow it to change.
		assert(!Value && !Func);
		Func = F;
		return *this;
		}
		Lazy &operator=(T &&Val) {
		assert(!Value && !Func);
		Value = std::move(Val);
		return *this;
		}
		Lazy &operator=(const T &Val) {
		assert(!Value && !Func);
		Value = Val;
		return *this;
		}

		const T &operator*() const {
		// Assert that a value has been assigned, lazy or otherwise.
		assert(Value || Func);
		if (Value)
		return *Value;
		std::function<void(T&)> TFunc{std::move(Func)};
		Func = [](T&) { llvm_unreachable("cycle detected"); };
		Value.emplace();
		TFunc(*Value);
		return *Value;
		MaskRay (inline comment): Early return?
		}
		const T *operator->() const { return &**this; }
		MaskRay (inline comment): `Value.emplace()`.
		};

		template <class T> Lazy<T> makeLazy(std::function<void(T &)> &&F) {
		return Lazy<T>{std::move(F)};
		}
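The blackhole idea discussed in the comments above can be shown in isolation. This is a minimal standalone sketch, not the patch's `Lazy` class: it uses `std::optional` instead of `llvm::Optional`, the name `LazyValue` is hypothetical, and it throws `std::logic_error` where the patch calls `llvm_unreachable`, so that the cycle case can be observed in a test.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for the patch's Lazy<T>; T must be default
// constructible.
template <class T> class LazyValue {
  mutable std::function<void(T &)> Thunk;
  mutable std::optional<T> Value;

public:
  explicit LazyValue(std::function<void(T &)> F) : Thunk(std::move(F)) {}

  const T &operator*() const {
    if (Value)
      return *Value; // Already evaluated: reuse the cached result.
    assert(Thunk && "no thunk assigned");
    // Replace the thunk with a "blackhole" before running it, so that a
    // re-entrant dereference is reported instead of recursing forever.
    std::function<void(T &)> Running = std::move(Thunk);
    Thunk = [](T &) { throw std::logic_error("cycle detected"); };
    T Tmp{};
    Running(Tmp);
    Value = std::move(Tmp);
    return *Value;
  }
  const T *operator->() const { return &**this; }
};
```

The sketch fills a temporary and stores it only after the thunk returns, so a re-entrant dereference reaches the blackhole rather than seeing a half-built value. That ordering is a choice of this example.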

/// This function behaves similarly to StringRef::substr(), but attempts to		/// This function behaves similarly to StringRef::substr(), but attempts to
/// terminate the returned StringRef at the first null terminator. If no null		/// terminate the returned StringRef at the first null terminator. If no null
/// terminator is found, an error is returned.		/// terminator is found, an error is returned.
///		///
/// @param Str Source string to create a substring from.		/// @param Str Source string to create a substring from.
/// @param Offset The start index of the desired substring.		/// @param Offset The start index of the desired substring.
static Expected<StringRef> terminatedSubstr(StringRef Str, size_t Offset) {		Expected<StringRef> terminatedSubstr(StringRef Str, size_t Offset) {
size_t StrEnd = Str.find('\0', Offset);		size_t StrEnd = Str.find('\0', Offset);
if (StrEnd == StringLiteral::npos) {		if (StrEnd == StringLiteral::npos) {
return createError(		return createError(
"String overran bounds of string table (no null terminator)");		"String overran bounds of string table (no null terminator)");
}		}

size_t StrLen = StrEnd - Offset;		size_t StrLen = StrEnd - Offset;
return Str.substr(Offset, StrLen);		return Str.substr(Offset, StrLen);
Show All 17 Lines

/// This function populates a DynamicEntries struct using an ELFT::DynRange.		/// This function populates a DynamicEntries struct using an ELFT::DynRange.
/// After populating the struct, the members are validated with		/// After populating the struct, the members are validated with
/// some basic sanity checks.		/// some basic sanity checks.
///		///
/// @param Dyn Target DynamicEntries struct to populate.		/// @param Dyn Target DynamicEntries struct to populate.
/// @param DynTable Source dynamic table.		/// @param DynTable Source dynamic table.
template <class ELFT>		template <class ELFT>
static Error populateDynamic(DynamicEntries &Dyn,		Error populateDynamic(DynamicEntries &Dyn, typename ELFT::DynRange DynTable) {
typename ELFT::DynRange DynTable) {
if (DynTable.empty())		if (DynTable.empty())
return createError("No .dynamic section found");		return createError("No .dynamic section found");

// Search .dynamic for relevant entries.		// Search .dynamic for relevant entries.
bool FoundDynStr = false;		bool FoundDynStr = false;
bool FoundDynStrSz = false;		bool FoundDynStrSz = false;
bool FoundDynSym = false;		bool FoundDynSym = false;
for (auto &Entry : DynTable) {		for (auto &Entry : DynTable) {
Show All 32 Lines
if (!FoundDynStrSz) {		if (!FoundDynStrSz) {
return createError(		return createError(
"Couldn't determine dynamic string table size (no DT_STRSZ entry)");		"Couldn't determine dynamic string table size (no DT_STRSZ entry)");
}		}
if (!FoundDynSym) {		if (!FoundDynSym) {
return createError(		return createError(
"Couldn't locate dynamic symbol table (no DT_SYMTAB entry)");		"Couldn't locate dynamic symbol table (no DT_SYMTAB entry)");
}		}
if (Dyn.SONameOffset.hasValue() && *Dyn.SONameOffset >= Dyn.StrSize) {		if (Dyn.SONameOffset.hasValue() && *Dyn.SONameOffset >= Dyn.StrSize) {
return createStringError(		return createStringError(object_error::parse_failed,
object_error::parse_failed,
"DT_SONAME string offset (0x%016" PRIx64		"DT_SONAME string offset (0x%016" PRIx64
") outside of dynamic string table",		") outside of dynamic string table",
*Dyn.SONameOffset);		*Dyn.SONameOffset);
}		}
for (uint64_t Offset : Dyn.NeededLibNames) {		for (uint64_t Offset : Dyn.NeededLibNames) {
if (Offset >= Dyn.StrSize) {		if (Offset >= Dyn.StrSize) {
return createStringError(		return createStringError(object_error::parse_failed,
object_error::parse_failed,
"DT_NEEDED string offset (0x%016" PRIx64		"DT_NEEDED string offset (0x%016" PRIx64
") outside of dynamic string table",		") outside of dynamic string table",
Offset);		Offset);
}		}
}		}

return Error::success();		return Error::success();
}		}

/// This function finds the number of dynamic symbols using a GNU hash table.		/// This function finds the number of dynamic symbols using a GNU hash table.
///		///
/// @param Table The GNU hash table for .dynsym.		/// @param Table The GNU hash table for .dynsym.
template <class ELFT>		template <class ELFT>
static uint64_t getDynSymtabSize(const typename ELFT::GnuHash &Table) {		uint64_t getDynSymtabSize(const typename ELFT::GnuHash &Table) {
using Elf_Word = typename ELFT::Word;		using Elf_Word = typename ELFT::Word;
if (Table.nbuckets == 0)		if (Table.nbuckets == 0)
return Table.symndx + 1;		return Table.symndx + 1;
uint64_t LastSymIdx = 0;		uint64_t LastSymIdx = 0;
uint64_t BucketVal = 0;		uint64_t BucketVal = 0;
// Find the index of the first symbol in the last chain.		// Find the index of the first symbol in the last chain.
for (Elf_Word Val : Table.buckets()) {		for (Elf_Word Val : Table.buckets()) {
BucketVal = std::max(BucketVal, (uint64_t)Val);		BucketVal = std::max(BucketVal, (uint64_t)Val);
Show All 11 Lines
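Because part of `getDynSymtabSize` is collapsed in this view, the counting technique is sketched here on its own: take the largest bucket value (the first symbol of the last chain), then walk that chain until an entry with its low bit set marks the end. `GnuHashView` and `countDynSyms` are hypothetical names operating on an already-decoded table, the Bloom filter is left out because counting does not need it, and the empty-table handling is a choice of this sketch that differs from the early return in the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, already-decoded view of a GNU hash table.
struct GnuHashView {
  uint32_t SymNdx;               // Index of the first hashed symbol.
  std::vector<uint32_t> Buckets; // First symbol of each chain, 0 if empty.
  std::vector<uint32_t> Chains;  // Chains[i] describes symbol SymNdx + i.
};

inline uint64_t countDynSyms(const GnuHashView &Table) {
  // The largest bucket value is the first symbol of the last chain.
  uint32_t Last = 0;
  for (uint32_t Bucket : Table.Buckets)
    Last = std::max(Last, Bucket);
  // All buckets empty: only the unhashed symbols [0, SymNdx) exist.
  if (Last < Table.SymNdx)
    return Table.SymNdx;
  std::size_t Idx = Last - Table.SymNdx;
  assert(Idx < Table.Chains.size() && "bucket points past the chain array");
  // Walk to the entry whose low bit marks the end of the chain. The bounds
  // check only guards against malformed input without a terminator.
  while ((Table.Chains[Idx] & 1) == 0 && Idx + 1 < Table.Chains.size())
    ++Idx;
  return uint64_t(Table.SymNdx) + Idx + 1;
}
```

For a table with `SymNdx = 1`, buckets `{1, 3}`, and two chains of two symbols each, the walk starts at symbol 3, stops at symbol 4, and reports five symbols including the null symbol at index 0.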

/// This function determines the number of dynamic symbols.		/// This function determines the number of dynamic symbols.
/// Without access to section headers, the number of symbols must be determined		/// Without access to section headers, the number of symbols must be determined
/// by parsing dynamic hash tables.		/// by parsing dynamic hash tables.
///		///
/// @param Dyn Entries with the locations of hash tables.		/// @param Dyn Entries with the locations of hash tables.
/// @param ElfFile The ElfFile that the section contents reside in.		/// @param ElfFile The ElfFile that the section contents reside in.
template <class ELFT>		template <class ELFT>
static Expected<uint64_t> getNumSyms(DynamicEntries &Dyn,		Expected<uint64_t> getNumSyms(DynamicEntries &Dyn,
const ELFFile<ELFT> &ElfFile) {		const ELFFile<ELFT> &ElfFile) {
using Elf_Hash = typename ELFT::Hash;		using Elf_Hash = typename ELFT::Hash;
using Elf_GnuHash = typename ELFT::GnuHash;		using Elf_GnuHash = typename ELFT::GnuHash;
// Search GNU hash table to try to find the upper bound of dynsym.		// Search GNU hash table to try to find the upper bound of dynsym.
if (Dyn.GnuHash.hasValue()) {		if (Dyn.GnuHash.hasValue()) {
Expected<const uint8_t *> TablePtr = ElfFile.toMappedAddr(*Dyn.GnuHash);		Expected<const uint8_t *> TablePtr = ElfFile.toMappedAddr(*Dyn.GnuHash);
if (!TablePtr)		if (!TablePtr)
return TablePtr.takeError();		return TablePtr.takeError();
const Elf_GnuHash *Table =		const Elf_GnuHash *Table =
Show All 12 Lines
}		}

/// This function extracts symbol type from a symbol's st_info member and		/// This function extracts symbol type from a symbol's st_info member and
/// maps it to an ELFSymbolType enum.		/// maps it to an ELFSymbolType enum.
/// Currently, STT_NOTYPE, STT_OBJECT, STT_FUNC, and STT_TLS are supported.		/// Currently, STT_NOTYPE, STT_OBJECT, STT_FUNC, and STT_TLS are supported.
/// Other symbol types are mapped to ELFSymbolType::Unknown.		/// Other symbol types are mapped to ELFSymbolType::Unknown.
///		///
/// @param Info Binary symbol st_info to extract symbol type from.		/// @param Info Binary symbol st_info to extract symbol type from.
static ELFSymbolType convertInfoToType(uint8_t Info) {		ELFSymbolType convertInfoToType(uint8_t Info) {
Info = Info & 0xf;		Info = Info & 0xf;
switch (Info) {		switch (Info) {
case ELF::STT_NOTYPE:		case ELF::STT_NOTYPE:
return ELFSymbolType::NoType;		return ELFSymbolType::NoType;
case ELF::STT_OBJECT:		case ELF::STT_OBJECT:
return ELFSymbolType::Object;		return ELFSymbolType::Object;
case ELF::STT_FUNC:		case ELF::STT_FUNC:
return ELFSymbolType::Func;		return ELFSymbolType::Func;
case ELF::STT_TLS:		case ELF::STT_TLS:
return ELFSymbolType::TLS;		return ELFSymbolType::TLS;
default:		default:
return ELFSymbolType::Unknown;		return ELFSymbolType::Unknown;
}		}
}		}
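`convertInfoToType` above masks the low four bits of `st_info`. The complementary half of the byte holds the binding. A small sketch of the packing, using helper names that are local to this example (the numeric constants are the standard ELF values):

```cpp
#include <cstdint>

// Standard ELF symbol binding and type values.
constexpr uint8_t STB_GLOBAL = 1, STB_WEAK = 2;
constexpr uint8_t STT_OBJECT = 1, STT_FUNC = 2, STT_TLS = 6;

// Hypothetical helpers: st_info stores the binding in the high nibble and
// the type in the low nibble.
constexpr uint8_t bindingOf(uint8_t Info) {
  return static_cast<uint8_t>(Info >> 4);
}
constexpr uint8_t typeOf(uint8_t Info) {
  return static_cast<uint8_t>(Info & 0xf);
}
constexpr uint8_t makeInfo(uint8_t Binding, uint8_t Type) {
  return static_cast<uint8_t>((Binding << 4) | (Type & 0xf));
}

// A global object, as in "Binding: Global (0x1)" and "Type: Object (0x1)".
static_assert(makeInfo(STB_GLOBAL, STT_OBJECT) == 0x11, "packed st_info");
```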

/// This function creates an ELFSymbol and populates all members using		/// This function creates an ELFSymbol and populates all members using
/// information from a binary ELFT::Sym.		/// information from a binary ELFT::Sym.
///		///
/// @param SymName The desired name of the ELFSymbol.		/// @param SymName The desired name of the ELFSymbol.
/// @param RawSym ELFT::Sym to extract symbol information from.		/// @param RawSym ELFT::Sym to extract symbol information from.
template <class ELFT>		template <class ELFT>
static ELFSymbol createELFSym(StringRef SymName,		ELFSymbol createELFSym(StringRef SymName, const typename ELFT::Sym &RawSym) {
const typename ELFT::Sym &RawSym) {
ELFSymbol TargetSym(SymName);		ELFSymbol TargetSym(SymName);
uint8_t Binding = RawSym.getBinding();		uint8_t Binding = RawSym.getBinding();
if (Binding == STB_WEAK)		if (Binding == STB_WEAK)
TargetSym.Weak = true;		TargetSym.Weak = true;
else		else
TargetSym.Weak = false;		TargetSym.Weak = false;

TargetSym.Undefined = RawSym.isUndefined();		TargetSym.Undefined = RawSym.isUndefined();
Show All 9 Lines

/// This function populates an ELFStub with symbols using information read		/// This function populates an ELFStub with symbols using information read
/// from an ELF binary.		/// from an ELF binary.
///		///
/// @param TargetStub ELFStub to add symbols to.		/// @param TargetStub ELFStub to add symbols to.
/// @param DynSym Range of dynamic symbols to add to TargetStub.		/// @param DynSym Range of dynamic symbols to add to TargetStub.
/// @param DynStr StringRef to the dynamic string table.		/// @param DynStr StringRef to the dynamic string table.
template <class ELFT>		template <class ELFT>
static Error populateSymbols(ELFStub &TargetStub,		Error populateSymbols(ELFStub &TargetStub, const typename ELFT::SymRange DynSym,
const typename ELFT::SymRange DynSym,
StringRef DynStr) {		StringRef DynStr) {
// Skips the first symbol since it's the NULL symbol.		// Skips the first symbol since it's the NULL symbol.
for (auto RawSym : DynSym.drop_front(1)) {		for (auto RawSym : DynSym.drop_front(1)) {
// If a symbol does not have global or weak binding, ignore it.		// If a symbol does not have global or weak binding, ignore it.
uint8_t Binding = RawSym.getBinding();		uint8_t Binding = RawSym.getBinding();
if (!(Binding == STB_GLOBAL || Binding == STB_WEAK))		if (!(Binding == STB_GLOBAL || Binding == STB_WEAK))
continue;		continue;
// If a symbol doesn't have default or protected visibility, ignore it.		// If a symbol doesn't have default or protected visibility, ignore it.
uint8_t Visibility = RawSym.getVisibility();		uint8_t Visibility = RawSym.getVisibility();
if (!(Visibility == STV_DEFAULT || Visibility == STV_PROTECTED))		if (!(Visibility == STV_DEFAULT || Visibility == STV_PROTECTED))
continue;		continue;
// Create an ELFSymbol and populate it with information from the symbol		// Create an ELFSymbol and populate it with information from the symbol
// table entry.		// table entry.
Expected<StringRef> SymName = terminatedSubstr(DynStr, RawSym.st_name);		Expected<StringRef> SymName = terminatedSubstr(DynStr, RawSym.st_name);
if (!SymName)		if (!SymName)
return SymName.takeError();		return SymName.takeError();
		// TODO: Populate alignment by calculating it from the section alignment
		// and the offset within the section. For now, just set it to zero.
ELFSymbol Sym = createELFSym<ELFT>(*SymName, RawSym);		ELFSymbol Sym = createELFSym<ELFT>(*SymName, RawSym);
		Sym.Alignment = 0;
TargetStub.Symbols.insert(std::move(Sym));		TargetStub.Symbols.insert(std::move(Sym));
// TODO: Populate symbol warning.		// TODO: Populate symbol warning.
}		}
return Error::success();		return Error::success();
}		}
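The two filters in `populateSymbols` can be read as one predicate over the raw `st_info` and `st_other` bytes. The sketch below assumes the standard ELF encodings (binding in the high nibble of `st_info`, visibility in the low two bits of `st_other`); `isExported` is a hypothetical name, not a function from the patch.

```cpp
#include <cstdint>

// Standard ELF binding and visibility values.
constexpr uint8_t STB_LOCAL = 0, STB_GLOBAL = 1, STB_WEAK = 2;
constexpr uint8_t STV_DEFAULT = 0, STV_INTERNAL = 1, STV_HIDDEN = 2,
                  STV_PROTECTED = 3;

// Hypothetical predicate: keep a symbol only if it is global or weak and
// has default or protected visibility.
constexpr bool isExported(uint8_t Info, uint8_t Other) {
  const uint8_t Binding = static_cast<uint8_t>(Info >> 4);
  const uint8_t Visibility = static_cast<uint8_t>(Other & 0x3);
  const bool BindingOk = Binding == STB_GLOBAL || Binding == STB_WEAK;
  const bool VisibilityOk =
      Visibility == STV_DEFAULT || Visibility == STV_PROTECTED;
  return BindingOk && VisibilityOk;
}
```

Under this predicate a global function with hidden visibility is skipped, while a weak object with protected visibility is kept.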

/// Returns a new ELFStub with all members populated from an ELFObjectFile.		/// Returns a new ELFStub with all members populated from an ELFObjectFile.
/// @param ElfObj Source ELFObjectFile.		/// @param ElfObj Source ELFObjectFile.
template <class ELFT>		template <class ELFT>
static Expected<std::unique_ptr<ELFStub>>		Expected<std::unique_ptr<ELFStub>>
buildStub(const ELFObjectFile<ELFT> &ElfObj) {		buildStub(const ELFObjectFile<ELFT> &ElfObj) {
using Elf_Dyn_Range = typename ELFT::DynRange;		using Elf_Dyn_Range = typename ELFT::DynRange;
using Elf_Phdr_Range = typename ELFT::PhdrRange;		using Elf_Phdr_Range = typename ELFT::PhdrRange;
using Elf_Sym_Range = typename ELFT::SymRange;		using Elf_Sym_Range = typename ELFT::SymRange;
using Elf_Sym = typename ELFT::Sym;		using Elf_Sym = typename ELFT::Sym;
std::unique_ptr<ELFStub> DestStub = make_unique<ELFStub>();		std::unique_ptr<ELFStub> DestStub = make_unique<ELFStub>();
const ELFFile<ELFT> *ElfFile = ElfObj.getELFFile();		const ELFFile<ELFT> *ElfFile = ElfObj.getELFFile();
// Fetch .dynamic table.		// Fetch .dynamic table.
Expected<Elf_Dyn_Range> DynTable = ElfFile->dynamicEntries();		Expected<Elf_Dyn_Range> DynTable = ElfFile->dynamicEntries();
if (!DynTable) {		if (!DynTable) {
return DynTable.takeError();		return DynTable.takeError();
}		}

// Fetch program headers.		// Fetch program headers.
Expected<Elf_Phdr_Range> PHdrs = ElfFile->program_headers();		Expected<Elf_Phdr_Range> PHdrs = ElfFile->program_headers();
if (!PHdrs) {		if (!PHdrs) {
return PHdrs.takeError();		return PHdrs.takeError();
}		}

// Collect relevant .dynamic entries.		// Collect relevant .dynamic entries.
DynamicEntries DynEnt;		DynamicEntries DynEnt;
if (Error Err = populateDynamic<ELFT>(DynEnt, *DynTable))		if (Error Err = populateDynamic<ELFT>(DynEnt, *DynTable))
return std::move(Err);		return std::move(Err);

// Get pointer to in-memory location of .dynstr section.		// Get pointer to in-memory location of .dynstr section.
Expected<const uint8_t *> DynStrPtr =		Expected<const uint8_t *> DynStrPtr =
ElfFile->toMappedAddr(DynEnt.StrTabAddr);		ElfFile->toMappedAddr(DynEnt.StrTabAddr);
if (!DynStrPtr)		if (!DynStrPtr)
return appendToError(DynStrPtr.takeError(),		return appendToError(DynStrPtr.takeError(),
"when locating .dynstr section contents");		"when locating .dynstr section contents");

StringRef DynStr(reinterpret_cast<const char *>(DynStrPtr.get()),		StringRef DynStr(reinterpret_cast<const char *>(DynStrPtr.get()),
DynEnt.StrSize);		DynEnt.StrSize);
Show All 27 Lines
if (!SymCount)		if (!SymCount)
return SymCount.takeError();		return SymCount.takeError();
if (*SymCount > 0) {		if (*SymCount > 0) {
// Get pointer to in-memory location of .dynsym section.		// Get pointer to in-memory location of .dynsym section.
Expected<const uint8_t *> DynSymPtr =		Expected<const uint8_t *> DynSymPtr =
ElfFile->toMappedAddr(DynEnt.DynSymAddr);		ElfFile->toMappedAddr(DynEnt.DynSymAddr);
if (!DynSymPtr)		if (!DynSymPtr)
return appendToError(DynSymPtr.takeError(),		return appendToError(DynSymPtr.takeError(),
"when locating .dynsym section contents");		"when locating .dynsym section contents");
Elf_Sym_Range DynSyms =		Elf_Sym_Range DynSyms = ArrayRef<Elf_Sym>(
ArrayRef<Elf_Sym>(reinterpret_cast<const Elf_Sym *>(*DynSymPtr),		reinterpret_cast<const Elf_Sym *>(*DynSymPtr), *SymCount);
*SymCount);
Error SymReadError = populateSymbols<ELFT>(*DestStub, DynSyms, DynStr);		Error SymReadError = populateSymbols<ELFT>(*DestStub, DynSyms, DynStr);
if (SymReadError)		if (SymReadError)
return appendToError(std::move(SymReadError),		return appendToError(std::move(SymReadError),
"when reading dynamic symbols");		"when reading dynamic symbols");
}		}

return std::move(DestStub);		return std::move(DestStub);
}		}

		/// This initializes an ELF file header with information specific to a binary
		/// dynamic shared object.
		/// Offsets, indexes, links, etc. for section and program headers are just
		/// zero-initialized as they will be updated elsewhere.
		///
		/// @param ElfHeader Target ELFT::Ehdr to populate.
		/// @param Machine Target architecture (e_machine from ELF specifications).
		template <class ELFT>
		void initELFHeader(typename ELFT::Ehdr &ElfHeader, uint16_t Machine) {
		using Elf_Ehdr = typename ELFT::Ehdr;
		using Elf_Phdr = typename ELFT::Phdr;
		using Elf_Shdr = typename ELFT::Shdr;

		memset(&ElfHeader, 0, sizeof(Elf_Ehdr));
		// ELF identification.
		ElfHeader.e_ident[EI_MAG0] = 0x7f; // ELFMAG0
		ElfHeader.e_ident[EI_MAG1] = 'E'; // ELFMAG1
		ElfHeader.e_ident[EI_MAG2] = 'L'; // ELFMAG2
		ElfHeader.e_ident[EI_MAG3] = 'F'; // ELFMAG3
		ElfHeader.e_ident[EI_CLASS] = ELFT::Is64Bits ? ELFCLASS64 : ELFCLASS32;
		bool IsLittleEndian = ELFT::TargetEndianness == support::little;
		ElfHeader.e_ident[EI_DATA] = IsLittleEndian ? ELFDATA2LSB : ELFDATA2MSB;
		ElfHeader.e_ident[EI_VERSION] = EV_CURRENT;
		ElfHeader.e_ident[EI_OSABI] = ELFOSABI_NONE;
		ElfHeader.e_ident[EI_ABIVERSION] = 0;

		// Remainder of ELF header.
		ElfHeader.e_type = ET_DYN;
		ElfHeader.e_machine = Machine;
		ElfHeader.e_version = EV_CURRENT;
		ElfHeader.e_entry = 0;
		ElfHeader.e_flags = 0;
		ElfHeader.e_ehsize = sizeof(Elf_Ehdr);
		ElfHeader.e_phentsize = sizeof(Elf_Phdr);
		ElfHeader.e_shentsize = sizeof(Elf_Shdr);
		}
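The identification bytes that `initELFHeader` writes are what `llvm-readobj` prints in the `Ident` block of the ehdr tests. A minimal sketch of just that part, using plain indices and values from the ELF specification; `makeIdent` is a hypothetical helper and takes the class and byte order as booleans instead of deriving them from `ELFT`.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: build the 16-byte e_ident array for a stub.
constexpr std::array<uint8_t, 16> makeIdent(bool Is64Bit,
                                            bool IsLittleEndian) {
  std::array<uint8_t, 16> Ident{}; // Zero-filled, including the padding.
  Ident[0] = 0x7f;                 // EI_MAG0
  Ident[1] = 'E';                  // EI_MAG1
  Ident[2] = 'L';                  // EI_MAG2
  Ident[3] = 'F';                  // EI_MAG3
  Ident[4] = Is64Bit ? 2 : 1;      // EI_CLASS: ELFCLASS64 or ELFCLASS32
  Ident[5] = IsLittleEndian ? 1 : 2; // EI_DATA: ELFDATA2LSB or ELFDATA2MSB
  Ident[6] = 1;                    // EI_VERSION: EV_CURRENT
  // EI_OSABI (index 7) and EI_ABIVERSION (index 8) stay 0.
  return Ident;
}
```

For example, `makeIdent(false, false)` yields the magic `7F 45 4C 46` followed by class 1 and data encoding 2, which corresponds to the `elf32-big` expectations.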

		template <class ELFT> struct OutputSection {
		using Elf_Shdr = typename ELFT::Shdr;
		std::string Name;
		Lazy<Elf_Shdr> Shdr;
		Lazy<uint64_t> Addr;
		Lazy<uint64_t> Offset;
		Lazy<uint64_t> Size;
		Lazy<uint64_t> Align;
		uint32_t Index;
		bool NoBits = true;
		};

		template <class T, class ELFT>
		struct ContentSection : public OutputSection<ELFT> {
		Lazy<T> Content;
		ContentSection() { this->NoBits = false; }
		};

		class ELFStringTableBuilder : public StringTableBuilder {
		public:
		ELFStringTableBuilder() : StringTableBuilder(StringTableBuilder::ELF) {}
		};

		template <class ELFT> struct Symbols {
		using Elf_Sym = typename ELFT::Sym;
		std::vector<Lazy<Elf_Sym>> Symbols;
		uint64_t MaxAlign = 0;
		uint64_t MaxAddr = 0;
		};

		template <class ELFT> class ELFBuilder {
		public:
		using Elf_Ehdr = typename ELFT::Ehdr;
		using Elf_Shdr = typename ELFT::Shdr;
		using Elf_Phdr = typename ELFT::Phdr;
		using Elf_Sym = typename ELFT::Sym;
		using Elf_Addr = typename ELFT::Addr;
		using Elf_Dyn = typename ELFT::Dyn;

		private:
		Lazy<Elf_Ehdr> ElfHeader;
		ContentSection<ELFStringTableBuilder, ELFT> StrTab;
		ContentSection<Symbols<ELFT>, ELFT> DynSym;
		OutputSection<ELFT> DefSec;
		ContentSection<std::vector<Elf_Dyn>, ELFT> Dynamic;
		std::vector<Lazy<Elf_Phdr>> ProgramHeaders;

		template <class T> static void Write(uint8_t *Data, const T &Value) {
		*reinterpret_cast<T *>(Data) = Value;
		}
		template <class T>
		static void WriteLazyVector(uint8_t *Data, const std::vector<Lazy<T>> &Vec) {
		T* Iter = reinterpret_cast<T *>(Data);
		MaskRay (inline comment): `T*` -> `T *`
		jakehehrlich (author, inline comment): Yeah there are a bunch of formatting issues. I'll run clang-format on these when I upload next.
		for (const auto &Value : Vec) {
		*Iter++ = *Value;
		}
		}
		template <class T>
		static void WriteVector(uint8_t *Data, const std::vector<T> &Vec) {
		std::copy(Vec.begin(), Vec.end(), reinterpret_cast<T *>(Data));
		}
		uint64_t ShdrOffset(const OutputSection<ELFT> &Sec) const {
		return ElfHeader->e_shoff + Sec.Index * sizeof(Elf_Shdr);
		}
		void WriteShdr(uint8_t *Data, const OutputSection<ELFT> &Sec) const {
		Write(Data + ShdrOffset(Sec), *Sec.Shdr);
		}

		public:
		ELFBuilder(const ELFBuilder &) = delete;
		ELFBuilder(ELFBuilder &&) = default;
		explicit ELFBuilder(const ELFStub &Stub) {
		std::vector<OutputSection<ELFT> *> Sections;
		Sections.push_back(&StrTab);
		Sections.push_back(&DynSym);
		Sections.push_back(&Dynamic);
		Sections.push_back(&DefSec);

		const OutputSection<ELFT> *LastSection = Sections.back();
		ElfHeader = [this, &Stub, LastSection](Elf_Ehdr &Ehdr) {
		initELFHeader<ELFT>(Ehdr, Stub.Arch);
		Ehdr.e_shstrndx = StrTab.Index;
		Ehdr.e_shnum = LastSection->Index + 1;
		Ehdr.e_phnum = ProgramHeaders.size();
		Ehdr.e_phoff = sizeof(Elf_Ehdr);
		if (LastSection->NoBits)
		Ehdr.e_shoff = alignTo(*LastSection->Offset, sizeof(Elf_Addr));
		else
		Ehdr.e_shoff = alignTo(*LastSection->Offset + *LastSection->Size,
		sizeof(Elf_Addr));
		};

		ProgramHeaders.emplace_back([this](Elf_Phdr &Load) {
		Load.p_type = PT_LOAD;
		Load.p_offset = 0x0;
		Load.p_vaddr = 0x0;
		Load.p_paddr = 0x0;
		Load.p_filesz = *DefSec.Offset;
		Load.p_memsz = *DefSec.Addr + *DefSec.Size;
		Load.p_flags = PF_R;
		Load.p_align = 0x1000;
		});
		ProgramHeaders.emplace_back([this](Elf_Phdr &Dyn) {
		Dyn.p_type = PT_DYNAMIC;
		Dyn.p_offset = *Dynamic.Offset;
		Dyn.p_vaddr = *Dynamic.Addr;
		Dyn.p_paddr = Dyn.p_vaddr;
		Dyn.p_filesz = *Dynamic.Size;
		Dyn.p_memsz = Dyn.p_filesz;
		Dyn.p_flags = PF_R;
		Dyn.p_align = *Dynamic.Align;
		});

		MaskRay (inline comment): What is this section used for?
		jakehehrlich (author, inline comment): It's the section in which all defined non-TLS symbols are defined. It's synthesized as a NOBITS section to avoid using any space.
		// Manually set the indexes of these.
		uint64_t Index = 1;
		const OutputSection<ELFT> *Prev = nullptr;
		uint64_t StartOffset =
		sizeof(Elf_Ehdr) + sizeof(Elf_Phdr) * ProgramHeaders.size();
		// Now set the Index, Offset, and Addr of everything.
		for (auto Sec : Sections) {
		Sec->Index = Index++;
		Sec->Offset = [=](uint64_t &Offset) {
		uint64_t Align = *Sec->Align ? *Sec->Align : 1;
		// Don't align the offset of a NOBITS section.
		if (Sec->NoBits)
		Align = 1;
		if (Prev == nullptr)
		Offset = alignTo(StartOffset, Align);
		else if (Prev->NoBits)
		MaskRay (inline comment): Which architecture uses the section name `.tls`? Did you mean `.tbss`?

		jakehehrlich (author, inline reply): `.tbss` would imply that this is writable when it isn't. In practice read-only data is already thread safe, so there is no need for a read-only TLS section. There is also in general no reason to have a read-only NOBITS section, since the content is just zeros. This means there aren't good names to use for what I'm doing here. These section names don't matter functionally, however. I'm open to many other names. I just went for short names where a name used in the wild didn't exist.

		MaskRay (inline comment):
		> In practice read-only data is already thread safe so there is no need for a read only TLS section.
		So why do you need the non-writable `.tss`?

		jakehehrlich (author, inline reply): I don't really need it, but I want to minimize size and complexity. The requirements that I found (which might be minimal):

		- You must have a .dynamic section if you have a soname or DT_NEEDED. Right now this is added even if no soname or needed libraries exist.
		- To test with, inspect, and work with a file with a .dynamic section you need a PT_DYNAMIC segment.
		- To have a PT_DYNAMIC segment you need a PT_LOAD which covers it. Permissions on either of these seem not to matter.
		- If you have TLS symbols defined in your module you need a TLS section. Using a TLS NOBITS section saves space. I have not observed that a TLS segment is needed, but I haven't really gotten to testing that out (I will remove it if it isn't needed).
		- I'd like to have only one PT_LOAD that covers everything, not just the PT_DYNAMIC segment (though that is an option). That means .dynstr and .dynsym as well as .dynamic, the TLS section, and the other definition section.
		- For aesthetic reasons I think the permissions on the PT_LOAD should match the permissions on the sections that it covers. Consequently all sections should have the same permissions if we want only one PT_LOAD.

		The TLS section is arguably not even covered by the PT_LOAD, so it might be fair to give it different permissions and call it .tbss, but this doesn't give us a proper name for the read-only NOBITS section that critically does need to be covered by the PT_LOAD segment.

		So it's an issue of aesthetic tradeoffs (cough bikeshed cough) in the face of some other motivated issues (having one PT_LOAD and using SHT_NOBITS where possible). There are a few options that I think do their best to satisfy these aesthetics:

		1. Make the PT_LOAD writable, and thus .dynsym and .dynstr as well (note that .dynamic is sometimes read-only and sometimes writable).
		2. Make the PT_LOAD read-only and have read-only NOBITS sections in both TLS and non-TLS varieties (the current choice), but without good names; the idea of these existing is otherwise senseless, and thus no previous examples of them exist.
		3. Make the PT_LOAD read-only, keep the non-TLS read-only NOBITS section, but make the TLS section writable so that we can call it ".tbss" in an aesthetically pleasing way.
		4. Make the PT_LOAD read-only, keep the sections read-only, and call them ".rodata" and ".tbss" anyhow, even if those names each imply (by convention) a false fact about our sections.

		As an addendum, we could choose slightly different names as well. By choosing new names, the aesthetic I violate is that I'm just making up section names (which happens a lot). I think anything else violates something I personally consider more fundamental, like consistency, permissions consistency, using standard permissions for the standard sections we do have, etc. I think there's also an argument to be made that using differing section names makes these sorts of binaries easily recognizable.
		Offset = alignTo(*Prev->Offset, Align);
		else
		Offset = alignTo(Prev->Offset + Prev->Size, Align);
		};
		Sec->Addr = [=](uint64_t &Addr) {
		uint64_t Align = Sec->Align ? Sec->Align : 1;
		if (Prev == nullptr)
		Addr = alignTo(StartOffset, Align);
		else
		Addr = alignTo(Prev->Addr + Prev->Size, Align);
		};
		Prev = Sec;
		}

		DefSec.Name = ".def";
		DefSec.Shdr = [this](Elf_Shdr &Shdr) {
		Shdr.sh_name = StrTab.Content->getOffset(DefSec.Name);
		Shdr.sh_type = SHT_NOBITS;
		Shdr.sh_flags = SHF_ALLOC;
		Shdr.sh_size = *DefSec.Size;
		Shdr.sh_addralign = *DefSec.Align;
		Shdr.sh_addr = *DefSec.Addr;
		Shdr.sh_offset = *DefSec.Offset;
		};
		DefSec.Size = [this](uint64_t &Size) { Size = DynSym.Content->MaxAddr; };
		DefSec.Align = [this](uint64_t &Align) {
		Align = DynSym.Content->MaxAlign;
		};

		// Define the string table.
		StrTab.Name = ".dynstr";
		StrTab.Shdr = [this](Elf_Shdr &Shdr) {
		Shdr.sh_name = StrTab.Content->getOffset(StrTab.Name);
		Shdr.sh_type = SHT_STRTAB;
		Shdr.sh_flags = SHF_ALLOC;
		Shdr.sh_size = *StrTab.Size;
		Shdr.sh_addr = *StrTab.Addr;
		Shdr.sh_offset = *StrTab.Offset;
		};
		StrTab.Size = [this](uint64_t &Size) { Size = StrTab.Content->getSize(); };
		StrTab.Align = 0;
		StrTab.Content = [this, &Stub](ELFStringTableBuilder &Builder) {
		Builder.add(StrTab.Name);
		Builder.add(DynSym.Name);
		Builder.add(DefSec.Name);
		Builder.add(Dynamic.Name);
		for (const auto &Sym : Stub.Symbols)
		Builder.add(Sym.Name);
		if (Stub.SoName)
		Builder.add(*Stub.SoName);
		for (const auto &Needed : Stub.NeededLibs)
		Builder.add(Needed);
		Builder.finalize();
		};

		// Define the symbol table.
		DynSym.Name = ".dynsym";
		DynSym.Shdr = [this](Elf_Shdr &Shdr) {
		Shdr.sh_name = StrTab.Content->getOffset(DynSym.Name);
		Shdr.sh_type = SHT_DYNSYM;
		Shdr.sh_flags = SHF_ALLOC;
		Shdr.sh_size = *DynSym.Size;
		Shdr.sh_addralign = sizeof(Elf_Addr);
		Shdr.sh_addr = *DynSym.Addr;
		Shdr.sh_offset = *DynSym.Offset;
		Shdr.sh_link = StrTab.Index;
		Shdr.sh_entsize = sizeof(Elf_Sym);
		};
		DynSym.Align = sizeof(Elf_Addr);
		// Make sure to account for the null symbol.
		DynSym.Size = (Stub.Symbols.size() + 1) * sizeof(Elf_Sym);
		DynSym.Content = [this, &Stub](Symbols<ELFT> &SymbolInfo) {
		uint64_t Addr = 0x0;
		uint64_t MaxAlign = 0x0;
		// Make sure to add the null symbol.
		SymbolInfo.Symbols.emplace_back([](Elf_Sym &Sym) {
		memset(&Sym, 0, sizeof(Sym));
		});
		for (const auto &Symbol : Stub.Symbols) {
		Elf_Sym ElfSym;
		memset(&ElfSym, 0, sizeof(ElfSym));
		ElfSym.st_name = StrTab.Content->getOffset(Symbol.Name);
		ElfSym.st_size = Symbol.Size;
		if (!Symbol.Undefined) {
		switch (Symbol.Type) {
		case ELFSymbolType::Func:
		ElfSym.st_shndx = DefSec.Index;
		ElfSym.st_value = Addr++;
		break;
		case ELFSymbolType::Object:
		case ELFSymbolType::TLS:
		ElfSym.st_shndx = DefSec.Index;
		// Make sure we use the minimal valid alignment. The expression below
		// aligns Addr + Alignment to twice the required alignment, which ensures
		// that after we subtract Alignment the result is aligned to exactly
		// Symbol.Alignment and no more. If just bumping Addr to Symbol.Alignment
		// would have given us minimal alignment, then so will this expression.
		// If it wouldn't, then this expression puts us only Symbol.Alignment
		// further along, which is optimal.
		outs() << "\n" << Addr << " " << Symbol.Alignment << "\n";
		Addr = alignTo(Addr + Symbol.Alignment, Symbol.Alignment ? 2*Symbol.Alignment : 1) - Symbol.Alignment;
		ElfSym.st_value = Addr;
		Addr += Symbol.Size;
		MaxAlign = std::max(Symbol.Alignment, MaxAlign);
		break;
		default:
		break;
		}
		}
		ElfSym.setType(static_cast<unsigned char>(Symbol.Type));
		if (Symbol.Weak)
		ElfSym.setBinding(STB_WEAK);
		else
		ElfSym.setBinding(STB_GLOBAL);
		// Add the final address back in.
		// TODO: Consider writing directly into the output so that each Elf_Sym is
		// written once and not twice.
		SymbolInfo.Symbols.emplace_back([this, ElfSym](Elf_Sym &Sym) {
		Sym = ElfSym;
		if (Sym.st_shndx != 0)
		Sym.st_value += *DefSec.Addr;
		});
		}
		SymbolInfo.MaxAddr = Addr;
		SymbolInfo.MaxAlign = MaxAlign;
		};

		// Construct the .dynamic table.
		Dynamic.Name = ".dynamic";
		Dynamic.Shdr = [this](Elf_Shdr &Shdr) {
		Shdr.sh_name = StrTab.Content->getOffset(Dynamic.Name);
		Shdr.sh_type = SHT_DYNAMIC;
		Shdr.sh_flags = SHF_ALLOC;
		Shdr.sh_addr = *Dynamic.Addr;
		Shdr.sh_offset = *Dynamic.Offset;
		Shdr.sh_size = *Dynamic.Size;
		Shdr.sh_link = StrTab.Index;
		Shdr.sh_addralign = sizeof(Elf_Addr);
		Shdr.sh_entsize = sizeof(Elf_Dyn);
		};
		Dynamic.Align = sizeof(Elf_Addr);
		Dynamic.Size = [this](uint64_t &Size) {
		// TODO: This can be calculated without knowing the Content.
		Size = Dynamic.Content->size() * sizeof(Elf_Dyn);
		};
		Dynamic.Content = [this, &Stub](std::vector<Elf_Dyn> &Entries) {
		auto Add = [&Entries](uint16_t Tag) -> Elf_Dyn & {
		Entries.emplace_back();
		Elf_Dyn &Dyn = Entries.back();
		Dyn.d_tag = Tag;
		return Dyn;
		};
		if (Stub.SoName)
		Add(DT_SONAME).d_un.d_val = StrTab.Content->getOffset(*Stub.SoName);
		for (const auto &Needed : Stub.NeededLibs)
		Add(DT_NEEDED).d_un.d_val = StrTab.Content->getOffset(Needed);
		Add(DT_STRTAB).d_un.d_ptr = *StrTab.Addr;
		Add(DT_STRSZ).d_un.d_ptr = *StrTab.Size;
		Add(DT_SYMTAB).d_un.d_ptr = *DynSym.Addr;
		Add(DT_SYMENT).d_un.d_ptr = sizeof(Elf_Sym);
		// TODO: For compatibility a hash table would be useful. In particular
		// llvm-elfabi currently only reads symbols from a hash table so it
		// can't read its own output. eu-elflint also complains about this issue.
		Add(DT_NULL);
		};
		}

		// GetSize will effectively compute the whole layout.
		size_t GetSize() const {
		return ElfHeader->e_shoff + ElfHeader->e_shnum * sizeof(Elf_Shdr);
		}

		void Write(uint8_t *Data) const {
		Write(Data, *ElfHeader);
		StrTab.Content->write(Data + StrTab.Shdr->sh_offset);
		WriteLazyVector(Data + sizeof(Elf_Ehdr), ProgramHeaders);
		WriteLazyVector(Data + DynSym.Shdr->sh_offset, DynSym.Content->Symbols);
		WriteVector(Data + Dynamic.Shdr->sh_offset, *Dynamic.Content);
		WriteShdr(Data, StrTab);
		WriteShdr(Data, DynSym);
		WriteShdr(Data, DefSec);
		WriteShdr(Data, Dynamic);
		}
		};
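
The file-offset rule in the `Sec->Offset` callback above can be tried in isolation. The following is a minimal sketch, not the patch's code: `SectionSketch`, `roundUpTo`, and `assignOffsets` are hypothetical names, offsets are computed eagerly rather than lazily, and the start offset of 176 assumes a 64-byte ELF header followed by two 56-byte program headers (the ELF64 sizes).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an output section; not the patch's OutputSection.
struct SectionSketch {
  uint64_t Size;   // size in bytes
  uint64_t Align;  // required alignment, 0 meaning "none"
  bool NoBits;     // true for SHT_NOBITS sections, which occupy no file space
  uint64_t Offset; // file offset, filled in by assignOffsets
};

// Stand-in for llvm::alignTo: round Value up to a multiple of Align (non-zero).
static uint64_t roundUpTo(uint64_t Value, uint64_t Align) {
  return (Value + Align - 1) / Align * Align;
}

// Assign file offsets in order, following the same three cases as the diff:
// first section, previous section is NOBITS, and the general case.
static void assignOffsets(std::vector<SectionSketch> &Sections,
                          uint64_t StartOffset) {
  const SectionSketch *Prev = nullptr;
  for (SectionSketch &Sec : Sections) {
    // A NOBITS section's own offset is not aligned.
    uint64_t Align = (Sec.Align != 0 && !Sec.NoBits) ? Sec.Align : 1;
    uint64_t Base = StartOffset;
    if (Prev != nullptr)
      // A NOBITS predecessor contributes no bytes to the file.
      Base = Prev->NoBits ? Prev->Offset : Prev->Offset + Prev->Size;
    Sec.Offset = roundUpTo(Base, Align);
    Prev = &Sec;
  }
}
```

With a 21-byte section, a 100-byte NOBITS section, and an 8-byte-aligned section laid out from offset 176, the NOBITS section lands at 197 and the section after it at 200, so the 100 bytes never appear in the file.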

		/// This function opens a file for writing and then writes a binary ELF stub to
		/// the file.
		///
		/// @param FilePath File path for writing the ELF binary.
		/// @param Stub Source ELFStub to generate a binary ELF stub from.
		template <class ELFT>
		Error writeELFBinaryToFile(StringRef FilePath, const ELFStub &Stub) {
		ELFBuilder<ELFT> Builder{Stub};
		Expected<std::unique_ptr<FileOutputBuffer>> BufOrError =
		FileOutputBuffer::create(FilePath, Builder.GetSize());
		if (!BufOrError) {
		Error FileReadError = BufOrError.takeError();
		std::string Message;
		raw_string_ostream Stream(Message);
		Stream << FileReadError;
		Stream << " when trying to open `" << FilePath << "` for writing";
		consumeError(std::move(FileReadError));
		return createStringError(errc::invalid_argument, Stream.str().c_str());
		}

		// Write binary to file.
		std::unique_ptr<FileOutputBuffer> Buf = std::move(*BufOrError);
		Builder.Write(Buf->getBufferStart());

		if (Error FileWriteError = Buf->commit())
		return FileWriteError;

		return Error::success();
		}

		} // end namespace
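
The "minimal valid alignment" expression used for object and TLS symbols in `DynSym.Content` is easier to follow with concrete numbers. This is a hedged sketch under stated assumptions: `alignToSketch` stands in for `llvm::alignTo`, and `placeMinimallyAligned` is a hypothetical helper that wraps the same arithmetic as the diff. The result is always an odd multiple of the alignment, so the address is aligned to `Alignment` but never to `2 * Alignment`.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for llvm::alignTo: round Value up to a multiple of Align (non-zero).
static uint64_t alignToSketch(uint64_t Value, uint64_t Align) {
  return (Value + Align - 1) / Align * Align;
}

// Hypothetical helper mirroring the expression in the diff:
//   alignTo(Addr + Alignment, Alignment ? 2 * Alignment : 1) - Alignment
// It returns the smallest address >= Addr that is an odd multiple of
// Alignment. An Alignment of 0 leaves Addr unchanged.
static uint64_t placeMinimallyAligned(uint64_t Addr, uint64_t Alignment) {
  if (Alignment == 0)
    return Addr;
  return alignToSketch(Addr + Alignment, 2 * Alignment) - Alignment;
}
```

For an alignment of 4, an address of 1 is placed at 4 (a plain round-up already gives minimal alignment), while an address of 5 is placed at 12, because a plain round-up would give 8, which is over-aligned.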

		namespace llvm {
		namespace elfabi {

		// This function wraps the ELFT writeELFBinaryToFile() so writeBinaryStub()
		// can be called without having to use ELFType templates directly.
		Error writeBinaryStub(StringRef FilePath, const ELFStub &Stub,
		ELFTarget OutputFormat) {
		if (OutputFormat == ELFTarget::ELF32LE) {
		return writeELFBinaryToFile<ELF32LE>(FilePath, Stub);
		} else if (OutputFormat == ELFTarget::ELF32BE) {
		return writeELFBinaryToFile<ELF32BE>(FilePath, Stub);
		} else if (OutputFormat == ELFTarget::ELF64LE) {
		return writeELFBinaryToFile<ELF64LE>(FilePath, Stub);
		} else if (OutputFormat == ELFTarget::ELF64BE) {
		return writeELFBinaryToFile<ELF64BE>(FilePath, Stub);
		}
		return createStringError(errc::invalid_argument,
		"Invalid binary output target");
		}

		Expected<std::unique_ptr<ELFStub>> readELFFile(MemoryBufferRef Buf) {
		Expected<std::unique_ptr<Binary>> BinOrErr = createBinary(Buf);
		if (!BinOrErr) {
		return BinOrErr.takeError();
		}

		Binary *Bin = BinOrErr->get();
		if (auto Obj = dyn_cast<ELFObjectFile<ELF32LE>>(Bin)) {
		// ... (14 lines not shown)

llvm/tools/llvm-elfabi/llvm-elfabi.cpp

		// ... (43 lines not shown)
		cl::opt<std::string>
		EmitTBE("emit-tbe",
		cl::desc("Emit a text-based ELF stub (.tbe) from the input file"),
		cl::value_desc("path"));
		cl::opt<std::string> SOName(
		"soname",
		cl::desc("Manually set the DT_SONAME entry of any emitted files"),
		cl::value_desc("name"));
		cl::opt<ELFTarget> BinaryOutputTarget(
		"output-target", cl::desc("Create a binary stub for the specified target"),
		cl::values(clEnumValN(ELFTarget::ELF32LE, "elf32-little",
		"32-bit little-endian ELF stub"),
		clEnumValN(ELFTarget::ELF32BE, "elf32-big",
		"32-bit big-endian ELF stub"),
		clEnumValN(ELFTarget::ELF64LE, "elf64-little",
		"64-bit little-endian ELF stub"),
		clEnumValN(ELFTarget::ELF64BE, "elf64-big",
		"64-bit big-endian ELF stub")));
		cl::opt<std::string> BinaryOutputFilePath(cl::Positional, cl::desc("output"));

		/// writeTBE() writes a Text-Based ELF stub to a file using the latest version
		/// of the YAML parser.
		static Error writeTBE(StringRef FilePath, ELFStub &Stub) {
		std::error_code SysErr;

		// Open file for writing.
		raw_fd_ostream Out(FilePath, SysErr);
		// ... (62 lines not shown; context: int main(int argc, char *argv[]) {)
		if (!StubOrErr) {
		Error ReadError = StubOrErr.takeError();
		WithColor::error() << ReadError << "\n";
		exit(1);
		}

		std::unique_ptr<ELFStub> TargetStub = std::move(StubOrErr.get());

		// Change SoName before emitting stubs.
		if (SOName.getNumOccurrences() == 1) {
		TargetStub->SoName = SOName;
		}

		// Write out .tbe file.
		if (EmitTBE.getNumOccurrences() == 1) {
		TargetStub->TbeVersion = TBEVersionCurrent;
		Error TBEWriteError = writeTBE(EmitTBE, *TargetStub);
		if (TBEWriteError) {
		WithColor::error() << TBEWriteError << "\n";
		exit(1);
		}
		}

		// Write out binary ELF stub.
		if (BinaryOutputFilePath.getNumOccurrences() == 1) {
		if (BinaryOutputTarget.getNumOccurrences() == 0) {
		WithColor::error() << "No binary output target specified.\n";
		exit(1);
		}
		Error BinaryWriteError = writeBinaryStub(BinaryOutputFilePath, *TargetStub,
		BinaryOutputTarget);
		if (BinaryWriteError) {
		WithColor::error() << BinaryWriteError << "\n";
		exit(1);
		}
		}
		}

Revision Contents

Diff 198111
