Page MenuHomePhabricator

[llvm-elfabi] Emit ELF header and string table section
Needs ReviewPublic

Authored by jakehehrlich on May 9 2019, 3:54 PM.

Details

Summary

This change serves to create the initial framework for outputting ELF files from llvm-elfabi.

Sorry this is so large but this, perhaps short of the code for outputting the string table, is about as small of a setup as I can get and still have something testable. You can see the full picture here: https://reviews.llvm.org/D55864. Had I tried to get to exactly this point more directly and not by splitting it probably wouldn't be quite as large or generically written but if possible I'd like to keep some of the generality here since I'll likely add it back soon.

Diff Detail

Repository
rL LLVM

Event Timeline

jakehehrlich created this revision.May 9 2019, 3:54 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 9 2019, 3:54 PM
MaskRay added inline comments.May 9 2019, 7:46 PM
llvm/test/tools/llvm-elfabi/write-elf64be-ehdr.test
2

I should copy my comment to your earlier revision here. If at this early stage you don't care about big-endian, delete it, or at least make it realistic, e.g. EM_PPC64.

jakehehrlich marked an inline comment as done.May 16 2019, 1:27 PM
jakehehrlich added inline comments.
llvm/test/tools/llvm-elfabi/write-elf64be-ehdr.test
2

I think we want to care about the fact that we're not doing anything ELFT specific. The architectures currently added only include X86_64 and AArch64 so without adding support for that first I can't add that. It's a really simple change it just seems orthogonal and I don't see the benefit. If you want I'll add power pc as a supported arch in this change as well ahd change the test accordingly.

Initial thoughts from me are the code looks good, but this needs more documentation overall. The approach is clever, but not immediately intuitive so I feel there should be in-code comments to explain it at a high level. Sort of like your descriptions in D55864, but in-code so they are visible to everyone reading it.

llvm/tools/llvm-elfabi/ELFObjHandler.cpp
48

Add some comments with how this is intended to be used, and how cycles are detected.

443–446

Carry-over from my change: use ELFMAG constants instead of hard-coded values.

483

I'm assuming this just exists to make templating easier. If so, add a quick one-line comment to mention that's why it exists.

504–514

Probably can omit these until they're needed.

530

I'd like to see some in-code documentation here as this is where the overall approach is first introduced. It's not immediately clear how this block of code integrates into the overall solution.

Perhaps start with a top-level description for what the constructor does/how it is done, then add a comment to this lambda that describes how it fits into the big picture.

This will make the code significantly more approachable for anyone else who may have to read this code in the future.

535

This should probably be 0 until program headers are added in a subsequent change.

593

This calculation doesn't look like it includes the string table.

jakehehrlich marked 6 inline comments as done.

Addressed Armando's comments

jakehehrlich added inline comments.Jun 10 2019, 1:03 PM
llvm/tools/llvm-elfabi/ELFObjHandler.cpp
483

Not exactly. I added a comment.

593

That's not the right way to think about it. Our layout is the following

ELF header
[ Program Headers ]
[ Section Contents ]
[ Section Headers ]

ElfHeader->e_shoff always points to wherever the section headers were laid out and ElfHeader->e_shnum tells us the total size.

Now that's all we have to think about and we don't and shouldn't think about anything else. To answer your question however it transitively depends on the size of the string table because the string table is a part of the Section Content fragment of our file and ElfHeader->e_shoff depends on the layout of that. The layout of that depends on the size of the string table which depends on the content of the string table.

Mostly looks good, some inline comments and questions.

Thanks!

-eric

llvm/include/llvm/BinaryFormat/ELF.h
60

This should at least be near the other magic version parts?

llvm/tools/llvm-elfabi/ELFObjHandler.cpp
32

What's with the change?

58

The name is somewhat inscrutable, perhaps something else?

194

The reformatting looks like it could be done separately?

436

There's already a lot of code in ELFWriter to do this, perhaps you could use it? Or at least support::endian::Writer? In that I see where you're going with this, but perhaps it could be done in a different way rather than redoing existing work (there are only so many times we need to have a new way of writing an elf header :)

644

llvm style is no braces for a single line.

jakehehrlich marked an inline comment as done.Jun 17 2019, 6:10 PM
jakehehrlich added inline comments.
llvm/tools/llvm-elfabi/ELFObjHandler.cpp
32

There was a whole bunch of stuff that was marked as 'static' here but the types and methods should all be "static" as well so I put them in an anonymous namespace and moved the important bits into the same nested namespace below.

58

Will do, how does "BuildNode" sound? Or "DelayedStep"? Any recommendations?

For those familiar with lazy evaluation this is an easy to understand name but "Thunk" would also be acceptable for that crowd.

194

Yeah I just applied it to the whole file and it had some unintended consequences. I'll fix it.

436

Can you be more specific about what you have in mind and where it would live? I'd prefer not to depend on MC if possible. I can think of like 4 places where this happens in llvm and all of it looks pretty similar but they all have subtle differences. Like here I hard code a lot of stuff. In llvm-objcopy we mostly copy that all over. In MC some things are generated and others are hard coded.

Only e_ident really contains anything that is universally hardcoded anywhere and even on aspect of that isn't hard coded. Everything else would need to be specified on a per use base if we wanted to abstract it away. But then what the user has to specify for the remainder is equivalent to what they're already doing. Do you perhaps just want an e_ident helper in the libObject ELF code?

@jakehehrlich Haven't seen an update on this diff in a while. How are things going?