This is an archive of the discontinued LLVM Phabricator instance.

[llvm-readelf] - Allow dumping the dynamic symbols when there is no program headers.
ClosedPublic

Authored by grimar on Sep 2 2019, 6:41 AM.

Details

Summary

D62179 introduced a regression. llvm-readelf lose the ability to dump the dynamic symbols
when there is .dynamic section with a DT_SYMTAB, but there are no program headers:
https://reviews.llvm.org/D62179#1652778

Below is a program flow before the D62179 change:

  1. Find SHT_DYNSYM.
  2. Find there is no PT_DYNAMIC => don't try to parse it.
  3. Print dynamic symbols using information about them found on step (1).

And after the change it became:

  1. Find SHT_DYNSYM.
  2. Find there is no PT_DYNAMIC => find SHT_DYNAMIC.
  3. Parse dynamic table, but fail to handle the DT_SYMTAB because of the absence of the PT_LOAD. Report the "Virtual address is not in any segment" error.

This patch fixes the issue. For doing this it checks that the value of DT_SYMTAB was mapped to a segment. If not - it ignores it.

Diff Detail

Repository
rL LLVM

Event Timeline

grimar created this revision.Sep 2 2019, 6:41 AM
grimar marked an inline comment as done.Sep 2 2019, 6:42 AM
grimar added inline comments.
tools/llvm-readobj/ELFDumper.cpp
1655 ↗(On Diff #218356)

for example when we want to dump the dynamic relocations and there is no sections table

This is used and tested by test\Object\relocation-executable.test

grimar edited the summary of this revision. (Show Details)Sep 2 2019, 6:43 AM
MaskRay added inline comments.Sep 2 2019, 9:26 AM
test/tools/llvm-readobj/dyn-symbols.test
185 ↗(On Diff #218356)

The test is different from https://reviews.llvm.org/D62179#1652778

In https://reviews.llvm.org/D62179#1652778, there is no program headers (corrupted phdr), so toMappedAddr returns a nullptr. Here, DT_SYMTAB is 0. When the information from DT_SYMTAB and .dynsym conflict, we probably should let DT_DYNTAB win...

tools/llvm-readobj/ELFDumper.cpp
1656 ↗(On Diff #218356)

We probably should check whether toMappedAddr returns a nullptr - which is the case if there is no program headers. This approach may be better than

if (!DynSymRegion.EntSize) {
grimar marked an inline comment as done.Sep 3 2019, 1:08 AM
grimar added inline comments.
test/tools/llvm-readobj/dyn-symbols.test
185 ↗(On Diff #218356)

The test is different from https://reviews.llvm.org/D62179#1652778

I think not. Here is the same YAML but with the program headers:

--- !ELF
FileHeader:
  Class:   ELFCLASS64
  Data:    ELFDATA2LSB
  Type:    ET_DYN
  Machine: EM_X86_64
Sections:
  - Name:    .dynamic
    Type:    SHT_DYNAMIC
    Entries:
      - Tag:   DT_SYMTAB
        Value: 0
      - Tag:   DT_NULL
        Value: 0
DynamicSymbols:
  - Name: foo
ProgramHeaders:
  - Type: PT_LOAD
    Flags: [ PF_R ]
    VAddr: 0x0000
    PAddr: 0x0000
    Align: 8
    Sections:
      - Section: .dynsym

The 0 value for DT_SYMTAB is valid. I just removed the ProgramHeaders.

When the information from DT_SYMTAB and .dynsym conflict, we probably should let DT_DYNTAB win...

I do not understand why. Particulary I do not understand why we should take .dynsym size from the section headers table
(it is the only source it seems), but its address from another place.
The much simpler logic is to try to take everything from section header at first and fallback to DT_SYMTAB. This is what this patch does.

If we have different results, then I guess it would be correct to report a warning, but I am not sure why DT_SYMTAB should win.
Doesn't it mean we have a broken object? In this case why should we make any assumptions about which place is correct and which is not?

MaskRay added inline comments.Sep 3 2019, 1:29 AM
test/tools/llvm-readobj/dyn-symbols.test
185 ↗(On Diff #218356)

There are two approaches to locate the dynamic symbol table:

  • program header -> PT_DYNAMIC -> DT_SYMTAB
  • section header -> .dynsym

The first approach is used by the final consumer of an executable/shared object: ld.so or some other interpreter/emulator. They locate the dynamic table (from a program header), parse it, and use DT_SYMTAB to resolve symbol lookups. They ignore the section header. Prioritizing the section header can potentially be cheated by a malicious binary.

// Information in the section header has priority over the information

// in a PT_DYNAMIC header.

So, I think we should probably prioritize the information in PT_DYNAMIC.

In this case, the program header does not exist. I think it is fine to ignore DT_SYMTAB in the first approach.

I don't have any particular strong feelings either way on the discussion in the comments, so I've just made a number of suggested minor comment fixes.

test/tools/llvm-readobj/dyn-symbols.test
166 ↗(On Diff #218356)

dump dynamic symbols table -> dump the dynamic symbol table

192 ↗(On Diff #218356)

to address -> to an address
of absence -> of the absence
of the corresponding -> of a corresponding

tools/llvm-readobj/ELFDumper.cpp
1648 ↗(On Diff #218356)

That -> Doing this

1649 ↗(On Diff #218356)

This -> This behaviour

1650 ↗(On Diff #218356)

Remove "exist"

1651 ↗(On Diff #218356)

symbols -> symbol

1652 ↗(On Diff #218356)

But we -> However, we

1654 ↗(On Diff #218356)

sections table -> section table

grimar marked an inline comment as done.Sep 3 2019, 2:15 AM
grimar added inline comments.
test/tools/llvm-readobj/dyn-symbols.test
185 ↗(On Diff #218356)

Information in the section header has priority over the information
in a PT_DYNAMIC header.

So, I think we should probably prioritize the information in PT_DYNAMIC.

This is comment taken from here:
https://github.com/llvm-mirror/llvm/blob/master/tools/llvm-readobj/ELFDumper.cpp#L1430

And it says that for locating the .dynamic we try to take the information from the sections header first and then fallback to PT_DYNAMIC if have no SHT_DYNAMIC section. That is how llvm-readeobj scans the object now.

So are you saying we should probably reimplement it to take the information from PT_DYNAMIC at first, right?

MaskRay added inline comments.Sep 3 2019, 4:15 AM
test/tools/llvm-readobj/dyn-symbols.test
185 ↗(On Diff #218356)

Yes, I think we should try PT_DYNAMIC first. I believe this is one of few things that GNU readelf -D (--use-dynamic) does. The option has been around for more than 20 years (it has been available since the commit 19990502 sourceware import). Fake .dynsym has been used by malware for years. I think we can just always default to -D.

grimar updated this revision to Diff 218454.Sep 3 2019, 7:43 AM
grimar marked 10 inline comments as done.
grimar edited the summary of this revision. (Show Details)
  • Reimplemented in according to discussion.
  • Added more test cases to verify the new behavior.
tools/llvm-readobj/ELFDumper.cpp
1648 ↗(On Diff #218356)

I had to rewrite this comment completely because of behavior change...

jhenderson added inline comments.Sep 4 2019, 9:19 AM
test/tools/llvm-readobj/dyn-symbols.test
220 ↗(On Diff #218454)

to address -> to an address

251 ↗(On Diff #218454)

of dynamic -> of the dynamic

274–275 ↗(On Diff #218454)

Won't this be implicitly generated?

tools/llvm-readobj/ELFDumper.cpp
1647 ↗(On Diff #218454)

dynamic symbols section -> dynamic symbol table

(technically, if there are no section headers, there's no dynamic symbol section)

1649 ↗(On Diff #218454)

has a priority -> has priority

1650 ↗(On Diff #218454)

in runtime -> at runtime

Normally the -> The
(should implies normally)

1652 ↗(On Diff #218454)

to address -> to an address

1655 ↗(On Diff #218454)

Using DynSymRegion.EntSize to determine if the dynamic symbol table has been found via a section header feels a bit weird to me. Please at least add a comment.

1659 ↗(On Diff #218454)

of dynamic -> of the dynamic

grimar updated this revision to Diff 218871.Sep 5 2019, 2:37 AM
grimar marked 10 inline comments as done.
  • Addressed review comments.
test/tools/llvm-readobj/dyn-symbols.test
274–275 ↗(On Diff #218454)

Yes, but this is probably not so obvious here? It probably can raise questions,
like: "will the default SHT_DYNSYM be generated if we already have one in the YAML?".
Having 2 described makes this test case pretty straightforward.

This revision is now accepted and ready to land.Sep 5 2019, 5:51 AM
MaskRay accepted this revision.Sep 5 2019, 5:56 AM
MaskRay added inline comments.
tools/llvm-readobj/ELFDumper.cpp
1660 ↗(On Diff #218871)

This warning looks good :) People will stay alert when they see this warning.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptSep 5 2019, 7:06 AM