This is an archive of the discontinued LLVM Phabricator instance.

[elfabi] Add support for reading dynamic symbols from binaries
ClosedPublic

Authored by amontanez on Dec 21 2018, 3:06 PM.

Details

Summary

This patch adds initial support for reading dynamic symbols from ELF binaries. Currently, STT_NOTYPE, STT_OBJECT, STT_FUNC, and STT_TLS are explicitly supported. Other symbol types are mapped to ELFSymbolType::Unknown to improve signal/noise ratio.

Symbols must meet two criteria to be read into in an ELFStub:

  • The symbol's binding must be STB_GLOBAL or STB_WEAK.
  • The symbol's visibility must be STV_DEFAULT or STV_PROTECTED.

This filters out symbols that aren't of interest during compile-time linking against a shared object.

This change uses DT_HASH and DT_GNU_HASH to determine the size of .dynsym. Using hash tables to determine the number of symbols in .dynsym allows llvm-elfabi to work on binaries without relying on section headers.

Diff Detail

Repository
rL LLVM

Event Timeline

amontanez created this revision.Dec 21 2018, 3:06 PM
amontanez edited the summary of this revision. (Show Details)
amontanez added a reviewer: ruiu.

Now uses DT_HASH and DT_GNU_HASH to determine the size of .dynsym. A number of tests still need to be updated as this change requires that a DT_SYMTAB entry exists in the .dynamic section (which many tests are missing).

jakehehrlich added inline comments.Jan 18 2019, 1:20 PM
llvm/tools/llvm-elfabi/ELFObjHandler.cpp
166 ↗(On Diff #182552)

Isn't this what values() is for?

195 ↗(On Diff #182552)

ELFT should be inferred here.

195 ↗(On Diff #182552)

i think rather than trying to be pedantic and get the largest index from both, you can just return right here.

198 ↗(On Diff #182552)

I think you might consider adding ElfHash in a second patch instead.

243 ↗(On Diff #182552)

Technically this gets a lot more complicated. You have a lot of special section indexes and there are there is this somewhat complicated thing for large symbol indexes that you don't need to implement now. You should check that the symbol index is below SHN_LORESERVE first. In final library form there is potential for this input from the disk to be wrong so you'll need to return an error anyhow.

Add a todo for the other things like common, abs, large indexes, etc...

269 ↗(On Diff #182552)

This should just be a range based for loop.

271 ↗(On Diff #182552)

There are macros to do this for I believe. Same with visibility.

amontanez updated this revision to Diff 182649.Jan 18 2019, 5:17 PM
amontanez marked 8 inline comments as done.

Addressed most of your comments, but I've left the ELF/SYSV hash table in since it amounts to 12 additional lines for now. I can pull it out later if that's really needed. I feel it's relevant and small enough to leave in.

amontanez added inline comments.Jan 22 2019, 9:44 AM
llvm/tools/llvm-elfabi/ELFObjHandler.cpp
195 ↗(On Diff #182552)

I'm not fully certain why, but ELFT can't be inferred in this situation.

243 ↗(On Diff #182552)

As per offline discussion, there's not really a case where any value of st_shndx other than SHN_UNDEF would make a symbol undefined.

amontanez updated this revision to Diff 182984.Jan 22 2019, 2:36 PM

Rebase on license changes, fix behavior for edge case when a .gnu.hash section has no buckets.

jakehehrlich accepted this revision.Jan 23 2019, 12:20 PM

Pending 1 nit, LGTM.

llvm/tools/llvm-elfabi/ELFObjHandler.cpp
273–275 ↗(On Diff #182984)

This would all be nicer if populateSymbol was part of a constructor for ELFSymbol and then you could used TargetStub.Symbols.emplace(*SymName, RawSym) instead of these 3 lines.

195 ↗(On Diff #182552)

I see why now, it is the "ELFT::..." You can think "::GnuHash" as acting like a function and the compiler would have to reverse that function. In fact there might not actually be a unique inverse.

There's an alternative, which is to use the underlying type that is paramterized by ELFT but this is *not* the intended surface of the ELFType library so you shouldn't do that. So this was correct as is I suppose.

This revision is now accepted and ready to land.Jan 23 2019, 12:20 PM
amontanez updated this revision to Diff 183219.Jan 23 2019, 5:22 PM
amontanez edited the summary of this revision. (Show Details)
amontanez added a reviewer: jhenderson.

Changed populateSymbol() to createELFSymbol().

amontanez marked 3 inline comments as done.Jan 23 2019, 5:26 PM
amontanez added inline comments.
llvm/tools/llvm-elfabi/ELFObjHandler.cpp
273–275 ↗(On Diff #182984)

Since we can't have TextAPI rely on libObject, we can't add a constructor for ELFSymbol that uses ELFT. I've changed populateSymbol() to createELFSymbol() so this portion looks slightly better.

amontanez updated this revision to Diff 183224.Jan 23 2019, 6:13 PM
amontanez marked an inline comment as done.

Added dummy DT_SYMTAB entries to tests so all tests pass again.

amontanez updated this revision to Diff 183404.Jan 24 2019, 2:36 PM

Fixed a comment that passed 80 character width limit.

This revision was automatically updated to reflect the committed changes.