This is an archive of the discontinued LLVM Phabricator instance.

[llvm-elfabi] Emit ELF .dynsym and .dynamic sections
ClosedPublic

Authored by haowei on Oct 14 2020, 4:04 PM.

Details

Summary

This change makes llvm-elfabi tool to generate a minimal stub ELF shared library just for linking purpose from a tbe file. Tested with lld.

Diff Detail

Event Timeline

haowei created this revision.Oct 14 2020, 4:04 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 14 2020, 4:04 PM
haowei requested review of this revision.Oct 14 2020, 4:04 PM
mcgrathr accepted this revision.Oct 23 2020, 10:43 AM

LGTM. The section ordering details are conventional but not supposed to be semantically meaningful anyway.

llvm/lib/InterfaceStub/ELFObjHandler.cpp
115

Seems safer to use s{}.

128

Making this S{} both obviates the need to set any values explicitly to zero and also ensures no field was forgotten.

143

This could be a single memcpy call using Symbols.data().

203

This could be a single memcpy call using Entries.data().

209

Just constexpr Elf_Dyn{} is enough here. You could also just memset directly instead of copying some zero bytes with memcpy.

249

No other flags are expected for SHT_DYNSYM or SHT_SYMTAB.

263

This need not set SHF_WRITE. e.g. Fuchsia uses -z rodynamic and some other platforms use R/O .dynamic sections. The writability does not matter at link time.

289

sizeof(Elf_Addr) seems clearest here and in other such cases.

303

Traditionally the ordering is to put all SHF_ALLOC sections before all others. .shstrtab in particular is usually the last section.

305

Traditional ordering between these is: .dynsym, .dynstr, .dynamic.

This revision is now accepted and ready to land.Oct 23 2020, 10:43 AM

Nice. Will you folks be updating llvm-ifs to use this support instead of yaml2obj soon? If so add me to the review, I will be happy to take a look.

MaskRay requested changes to this revision.Oct 23 2020, 10:56 AM

There appears to multiple review comments from @mcgrathr so I think "Request Changes" is the suitable action for now.

llvm/lib/InterfaceStub/ELFObjHandler.cpp
131

Remove the TODO. 0 is safe.

This revision now requires changes to proceed.Oct 23 2020, 10:56 AM

Nice. Will you folks be updating llvm-ifs to use this support instead of yaml2obj soon? If so add me to the review, I will be happy to take a look.

That's the plan. When the InterfaceStub library is ready, which should be soon hopefully, we'll integrate it into llvm-ifs and remove llvm-elfabi.

MaskRay added inline comments.Oct 23 2020, 11:00 AM
llvm/lib/InterfaceStub/ELFObjHandler.cpp
297
320

Capitalize the comment and end with a period.

https://llvm.org/docs/CodingStandards.html#commenting

330

ditto

334

ditto

389

Usually linkers place .dynsym .dynstr before .strtab .shstrtab .symtab

The source order does not reflect the actual order and can cause confusion

llvm/test/tools/llvm-elfabi/binary-write-sheaders.test
66

The section order should be changed. See the comment above.

haowei updated this revision to Diff 300414.Oct 23 2020, 2:47 PM

Address review comments.

haowei marked 17 inline comments as done.Oct 23 2020, 2:53 PM
haowei added inline comments.
llvm/lib/InterfaceStub/ELFObjHandler.cpp
389

I have reordered the section headers. The StrTab here is actually .dynstr since a stub SO does not need .strtab .

grimar added a subscriber: grimar.Oct 25 2020, 11:36 PM
grimar added inline comments.
llvm/lib/InterfaceStub/ELFObjHandler.cpp
113

You can avoid having 2 public: areas by merging them.

195

If you add DT_NULL explicitly in the code, then you can simplify these 2 methods to:

size_t getSize() const { return Entries.size() * sizeof(Elf_Dyn); }
 
void write(uint8_t *Buf) const { memcpy(Buf, Entries.data(), getSize()); }
grimar added inline comments.Oct 25 2020, 11:43 PM
llvm/lib/InterfaceStub/ELFObjHandler.cpp
183

It is not clear why these e methods are needed? You could just call addValue on the caller side.

haowei updated this revision to Diff 300800.Oct 26 2020, 2:57 PM
haowei marked an inline comment as done.

address review comments.

haowei added inline comments.Oct 26 2020, 2:58 PM
llvm/lib/InterfaceStub/ELFObjHandler.cpp
113

The private section needs the "Elf_Sym" type alias declared in the first public section and the second public section requires the private "Symbols" variable. So I don't think I can merge the two public sections as it is a chicken and egg issue.

183

I feel these methods are better for readability. I removed those in latest patch.

195

I prefer to keep "add DT_NULL" in the builder class instead of explicitly in the code. It reduces the chance a maintainer add new entries after DT_NULL entry in the future.

grimar added inline comments.Oct 26 2020, 11:10 PM
llvm/lib/InterfaceStub/ELFObjHandler.cpp
113

There is no chicken and egg issue here. C++ doesn't work like that.
You don`t have ro declare the Symbols member before using it in member functions.
So, you can do the following:

template <class ELFT> class ELFSymbolTableBuilder {
public:
  using Elf_Sym = typename ELFT::Sym;

  ELFSymbolTableBuilder() {
    Elf_Sym S{};
    Symbols.push_back(S);
  }

  void add(ELFStringTableBuilder &StrTabBuilder, StringRef StName,
           uint64_t StSize, uint8_t StBind, uint8_t StType, uint8_t StOther,
           uint16_t StShndx) {
    Elf_Sym S{};
    S.st_name = StrTabBuilder.getOffset(StName);
    S.st_size = StSize;
    S.st_info = (StBind << 4) | (StType & 0xf);
    S.st_other = StOther;
    S.st_shndx = StShndx;
    Symbols.push_back(S);
  }

  size_t getSize() const { return Symbols.size() * sizeof(Elf_Sym); }

  void write(uint8_t *Buf) const {
    memcpy(Buf, Symbols.data(), sizeof(Elf_Sym) * Symbols.size());
  }

private:
  llvm::SmallVector<Elf_Sym, 64> Symbols;
};
119

Instead of passing both StrTabBuilder abd StName you can just pass an offset.

195

I think, this is a design issue. With this approach Entries doesn't contain all entries written.
DT_NULL is a terminator entry, but it is still an entry.

Though, if you want go this way, then you should be able to use memset:
memset(Buf + sizeof(Elf_Dyn) * Entries.size(), 0, sizeof(Elf_Dyn));

instead of

constexpr Elf_Dyn Terminator{};
memcpy(Buf + sizeof(Elf_Dyn) * Entries.size(), &Terminator,
       sizeof(Elf_Dyn));

Also, I was unable to apply this patch. Seems it is rebased on a different one. In this way you probably might want to add a parent revision,
so that dependencies can be seen.

MaskRay added inline comments.Oct 26 2020, 11:22 PM
llvm/lib/InterfaceStub/ELFObjHandler.cpp
111

64 is too large. This costs 24*64 bytes to the class. 0 may be fine assuming you always need a dynamic allocation (which is not a big deal)

331

Linkers don't read DT_SYMTAB or DT_STRTAB.

haowei updated this revision to Diff 301101.Oct 27 2020, 2:25 PM

Address review comments

haowei marked 2 inline comments as done.Oct 27 2020, 2:36 PM
haowei added inline comments.
llvm/lib/InterfaceStub/ELFObjHandler.cpp
111

I changed the value to 8, similar to other SmallVector used in this file. I think it should be small enough now.

113

Thanks for pointing out. I have merged 2 public sections.

119

Done.

195

I changed it to memset.

331

I tried locally to remove DT_SYMTAB and DT_STRTAB and confirm it does not affect linking. But I would prefer to keep these 2 entries since the generated Stub SO file does have .dynstr and .dynsym sections. There might be other tools would look for these references in .dynamic table and it would yield unexpected results if they are removed.

haowei marked 2 inline comments as done.Oct 27 2020, 2:38 PM

Also, I was unable to apply this patch. Seems it is rebased on a different one. In this way you probably might want to add a parent revision,
so that dependencies can be seen.

Sorry about that. I thought the attachment to the parent revision would be automatic. I have manually attached the parent revision to this diff.

haowei updated this revision to Diff 301153.Oct 27 2020, 6:07 PM

I have no more comments (added a minor nit).

I thought the attachment to the parent revision would be automatic. I have manually attached the parent revision to this diff.

It can be automatic if you add the following line to the patch description when posting:
"Depends on D<patchid>". E.g. "Depends on D61767" in this case.

llvm/lib/InterfaceStub/ELFObjHandler.cpp
112

nit: you should be able to do:

Symbols.push_back({});
haowei updated this revision to Diff 301362.Oct 28 2020, 11:58 AM

address review comments

haowei marked an inline comment as done.Oct 28 2020, 11:58 AM
MaskRay edited reviewers, added: grimar; removed: MaskRay.Nov 5 2020, 2:22 PM
MaskRay edited subscribers, added: MaskRay; removed: grimar.
This revision is now accepted and ready to land.Nov 5 2020, 2:22 PM

Please let @grimar approve:)

After revisiting this patch I have some more questions :]

llvm/lib/InterfaceStub/ELFObjHandler.cpp
150

Can't this be the following?

void modifyAddr(size_t Index, uint64_t Addr) {
  Entries[Index].d_un.d_ptr = Addr;
}

(the same below)

325

I am not sure what is happening here. Could you explain?
Why a symbol is attached to .text and why it is assumed that .text has index 1?
Also, where it is tested?

haowei updated this revision to Diff 303576.Nov 6 2020, 3:45 PM
haowei marked an inline comment as done.

Address review concerns.

haowei added inline comments.Nov 6 2020, 3:49 PM
llvm/lib/InterfaceStub/ELFObjHandler.cpp
150

Fixed in latest diff.

325

This snippet adds symbols from the ELFStub structure into the .dynsym section builder. The "1" here is a placeholder. It does not mean ".text" has to be index 1. The issue here is for non-Undefined symbols, the shndx needs to be a value other than 0. Please correct me if I am wrong, at link time, for dynamic linking, the shndx value does not matter as long as it is not SHN_UNDEF for non-Undefined symbols, that why I put 1 here, which in current state, it points to ".dynsym" itself.

The TODO message means I would like to add a dummy place holder ".text" section and point symbols to it in the stub SO in case any tool need a valid shndx value. For linking with ld/lld, I don't think that is necessary.

I added section name in the binary-write-sheaders.test to verify shndx is filled with expected value. I do want to add some integration test to verify this elfabi tool will work with clang and lld, but it looks like I don't have access to the lld in the llvm test. If you have any good suggestion for integration testing, please let me know.

We did performed end to end testing with elfabi tool in Fuchsia build and the linked binary from stub SO were equivalent to the ones linked from original SO files. If we chose to preserve the order of symbol table, the build results were the same bit by bit.

haowei updated this revision to Diff 303578.Nov 6 2020, 3:50 PM
grimar added inline comments.Nov 8 2020, 11:57 PM
llvm/lib/InterfaceStub/ELFObjHandler.cpp
150

Not fixed. I suggested a one line method.

325

Please correct me if I am wrong, at link time, for dynamic linking, the shndx value does not matter as long as it is not SHN_UNDEF for non-Undefined symbols, that why I put 1 here, which in current state, it points to ".dynsym" itself.

I guess you are right. I can't imagine a good reason why shndx can be important in run time. Perhaps someone else who are more familar with the logic of existent dynamic loaders can confirm.

The TODO message means I would like to add a dummy place holder ".text" section and point symbols to it in the stub SO in case any tool need a valid shndx value.

I think you should use "point" word, not "points" then (in the TODO comment). Now the comment sounds like it already points to .text.

Also, I think it worth to update this comment to explain that the symbol index is not really important and you assign an arbitrary non-SNH_UNDEF value, which now points to .dynsym itself.

I am not sure how much is useful to mention ".text" here (i.e. how much usefull would be to add this section and TODO about that plan).
As far I understand there is no known tool that is known to not like the existent behavior.

I added section name in the binary-write-sheaders.test to verify shndx is filled with expected value.

I think it is enough. The behavior of LLD is tested separatelly.

haowei updated this revision to Diff 303948.Nov 9 2020, 11:41 AM
haowei marked an inline comment as done.
haowei added inline comments.
llvm/lib/InterfaceStub/ELFObjHandler.cpp
150

I see, I misunderstood your suggestion. It should be fixed in latest diff now.

325

I have updated the comment section and remove the TODO.

grimar accepted this revision.Nov 9 2020, 10:54 PM

LGTM I think.

This revision was landed with ongoing or failed builds.Nov 24 2020, 12:17 AM
This revision was automatically updated to reflect the committed changes.