This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objcopy] --add-symbol: fix crash if SHT_SYMTAB does not exist
ClosedPublic

Authored by MaskRay on Oct 17 2019, 2:16 AM.

Details

Summary

Exposed by D69041. If SHT_SYMTAB does not exist, ELFObjcopy.cpp:handleArgs will crash due
to a null pointer dereference.

for (const NewSymbolInfo &SI : Config.ELF->SymbolsToAdd) {
  ...
  Obj.SymbolTable->addSymbol(

Fix this by creating .symtab and .strtab on demand in ELFBuilder<ELFT>::readSections,
if --add-symbol is specified.

Diff Detail

Event Timeline

MaskRay created this revision.Oct 17 2019, 2:16 AM
grimar added inline comments.Oct 17 2019, 2:39 AM
test/tools/llvm-objcopy/ELF/add-symbol-no-symtab.test
5 ↗(On Diff #225379)

Will we want to remove this call when yaml2obj patch that supports omiting ".strtab" be landed?
If so - worth adding a TODOs, probably, I am not sure it is critical though.

30 ↗(On Diff #225379)

I'd say - "Check we create ... "

36 ↗(On Diff #225379)

".shstrtab" != ".strtab"

tools/llvm-objcopy/ELF/Object.cpp
1554 ↗(On Diff #225379)

Perhaps if (!StrTab)? Note that code above doesn't do if (Obj.SymbolTable != null for example.

1557 ↗(On Diff #225379)

Avoid auto?

grimar added inline comments.Oct 17 2019, 2:41 AM
test/tools/llvm-objcopy/ELF/add-symbol-no-symtab.test
5 ↗(On Diff #225379)

I meant ".symtab".

MaskRay marked 2 inline comments as done.Oct 17 2019, 2:46 AM
MaskRay added inline comments.
test/tools/llvm-objcopy/ELF/add-symbol-no-symtab.test
5 ↗(On Diff #225379)

I have added an explicit Symbols: to ensure .symtab exists, even after D69041.

This should make the TODO unnecessary?

For similar reasons, I use an explicit llvm-objcopy --remove-section=.symtab --remove-section=.strtab instead of llvm-strip to make it clear both sections are stripped below.

36 ↗(On Diff #225379)

Yes, we overload .shstrtab (section names only) for more purposes (section names+symbol names). I think this is fine, but it definitely deserves a comment.

(Similarly, clang reuse .strtab for both section names and symbol names.)

GNU objcopy does not allow such object files (I haven't looked into details what sections (.strtab or .symtab) it expects).

% objcopy --add-symbol=abs1=1 t t2
objcopy: error: the input file 'a' has no sections
grimar added inline comments.Oct 17 2019, 2:55 AM
test/tools/llvm-objcopy/ELF/add-symbol-no-symtab.test
5 ↗(On Diff #225379)

I have added an explicit Symbols: to ensure .symtab exists, even after D69041.

Yeah, that was my question. Do you plan to remove "Symbols:" and this call later or not.
I think it is not a problem to keep it, it makes things self-documented. So its OK as is.

MaskRay updated this revision to Diff 225383.Oct 17 2019, 2:59 AM
MaskRay marked 3 inline comments as done.
MaskRay edited the summary of this revision. (Show Details)

Address review comments

MaskRay added inline comments.Oct 17 2019, 3:02 AM
tools/llvm-objcopy/ELF/Object.cpp
1557 ↗(On Diff #225379)

In all existing call sites auto are used.. Anyway, SymbolTableSection is clearer (it is not obvious addSection<T> returns T&) so I will avoid auto.

MaskRay marked an inline comment as done.Oct 17 2019, 3:09 AM
MaskRay added inline comments.
test/tools/llvm-objcopy/ELF/add-symbol-no-symtab.test
36 ↗(On Diff #225379)

To expand a bit more on this topic.

For better test coverage, we should have a test that places .shstrtab before .strtab so that we can check the created .symtab picks .strtab as its sh_link field. But I don't know any way to reorder sections.

Since rL238073, clang no longer produces .shstrtab and uses a unified .strtab for both section names and symbol names. I cannot write something like

if (Sec.Type == ELF::SHT_STRTAB && Obj.SectionNames != &Sec)

because such use cases will fail. Again, I cannot test it with existing yaml2obj functionality (I am not sure it is useful enough to make it possible to test such scenarios... probably not worth it)

MaskRay edited the summary of this revision. (Show Details)Oct 17 2019, 3:10 AM
MaskRay updated this revision to Diff 225386.Oct 17 2019, 3:24 AM
if (Sec.Type == ELF::SHT_STRTAB) {

>

if (Sec.Type == ELF::SHT_STRTAB && !(Sec.Flags & SHF_ALLOC)) {

And add a test.

MaskRay marked 4 inline comments as done.Oct 17 2019, 3:27 AM
grimar accepted this revision.Oct 17 2019, 3:30 AM

LGTM with a 2 suggestions.

test/tools/llvm-objcopy/ELF/add-symbol-no-symtab.test
30 ↗(On Diff #225383)

So this should probably say "create both symbol table and string table".
I.e. use "string table" instead of ".strtab", because it is a bit confusing that
there is no any ".strtab" in the output.

36 ↗(On Diff #225379)

I see, thanks for the explanation.

tools/llvm-objcopy/ELF/Object.cpp
1546 ↗(On Diff #225383)

nit: I think you need to add curly bracers to wrap the 'for' body.

Perhaps something like the following?

for (auto &Sec : Obj.sections()) {
  if (Sec.Type == ELF::SHT_STRTAB)
    continue:
  StrTab = static_cast<StringTableSection *>(&Sec);
  // Prefer .strtab to .shstrtab.
  if (Obj.SectionNames != &Sec)
    break;
}
This revision is now accepted and ready to land.Oct 17 2019, 3:30 AM
MaskRay updated this revision to Diff 225396.Oct 17 2019, 4:18 AM
MaskRay marked 2 inline comments as done.

Address comments

This revision was automatically updated to reflect the committed changes.
rupprecht added inline comments.Oct 17 2019, 10:26 AM
llvm/tools/llvm-objcopy/ELF/Object.cpp
1547

Why do we need to check SHF_ALLOC?

1550–1552

Should .dynstr also be avoided? Is it better to just check that the name is literally .strtab?

MaskRay marked 2 inline comments as done.Oct 18 2019, 7:16 AM
MaskRay added inline comments.
llvm/tools/llvm-objcopy/ELF/Object.cpp
1547

.dynstr (SHT_STRTAB) has the SHF_ALLOC flag. We can't reuse .dynstr for .symtab (non-SHF_ALLOC).

The case is tested by # SEC3: in add-symbol-no-symtab.test

1550–1552

.dynstr is avoided by !(Sec.Flags & SHF_ALLOC). .strtab works as well but Obj.SectionNames != &Sec is more general. There are multiple ways to accomplish the samething. I just wanted to make the code less magical.

tools/llvm-objcopy/ELF/Object.cpp
1546 ↗(On Diff #225383)

Added the brace, but I will not the early-return pattern here. The body is short and I think it might be clearer as is.

rupprecht added inline comments.Oct 18 2019, 8:33 AM
llvm/tools/llvm-objcopy/ELF/Object.cpp
1550–1552

From: http://www.sco.com/developers/gabi/latest/ch4.sheader.html#special_sections

.strtab
This section holds strings, most commonly the strings that represent the names associated with symbol table entries. If the file has a loadable segment that includes the symbol string table, the section's attributes will include the SHF_ALLOC bit; otherwise, that bit will be off.

So, .strtab can also be SHF_ALLOC.

abrachet marked an inline comment as done.Oct 18 2019, 8:57 AM
abrachet added inline comments.
llvm/tools/llvm-objcopy/ELF/Object.cpp
1550–1552

For reference, we used to have a file in test/tools/llvm-objcopy/ELF/Inputs/ called alloc-symtab which had an SHF_ALLOC .strtab. It was removed in D65278.

MaskRay marked an inline comment as done.Oct 18 2019, 7:48 PM
MaskRay added inline comments.
llvm/tools/llvm-objcopy/ELF/Object.cpp
1550–1552

To confirm, I think ... && !(Sec.Flags & SHF_ALLOC) is correct here.

grimar added inline comments.Oct 21 2019, 3:01 AM
llvm/tools/llvm-objcopy/ELF/Object.cpp
1550–1552

For reference, we used to have a file in test/tools/llvm-objcopy/ELF/Inputs/ called alloc-symtab which had an SHF_ALLOC .strtab. It was removed in D65278.

Thanks for the reference. I completely forgot that saw an allocatable .symtab that time.

To clarify: in D65278 I've replaced a precompiled binary with a YAML description (with a SHF_ALLOC .symtab).
The logic of tests shouldn't have been affected, because tests wanted to see the SHF_ALLOC symbol table (but not SHF_ALLOC string table)
I believe.

If we need a test with a SHF_ALLOC .strtab, it should be possible to craft one with a yaml2obj too I think.
(I haven't check, but I think all that needed is to define a string table section and a Flag property which should override the default).

1550–1552

So, .strtab can also be SHF_ALLOC.

To confirm, I think ... && !(Sec.Flags & SHF_ALLOC) is correct here.

To summarize and clarify this a bit:
This part of code is called when there is no symbol table (.symtab).
Jordan mentioned that .strtab can be allocatable in according to specification.

So if I understand correctly, the question is: is it possible that
we have no symbol table, but have an allocatable .strtab in the object?

If it is, this code might either use the .shstrtab instead of existent .strtab,
or perform an attemp to create a new string table (if there is no .shstrtab to reuse) it seems.
Then it will create a .symtab using the string table either found or created instead of the existent allocatable '.strtab'.
Does not sound correct perhaps?

MaskRay marked an inline comment as done.Oct 21 2019, 10:10 AM
MaskRay added inline comments.
llvm/tools/llvm-objcopy/ELF/Object.cpp
1550–1552

Both SHF_ALLOC .strtab and non-SHF_ALLOC .strtab are possible, though I don't think ld or objcopy could create SHF_ALLOC .strtab.

Generally objcopy should not alter SHF_ALLOC sections. The code skips both .dynstr (SHF_ALLOC) and SHF_ALLOC .strtab. SHF_ALLOC .strtab is not tested, though.

rupprecht added inline comments.Oct 21 2019, 10:31 AM
llvm/tools/llvm-objcopy/ELF/Object.cpp
1550–1552

Yes, exactly. Repro:

$ cat strtab.yaml
--- !ELF
FileHeader:
  Class:   ELFCLASS64
  Data:    ELFDATA2LSB
  Type:    ET_REL
  Machine: EM_X86_64
Sections:
  - Name: .strtab
    Type: SHT_STRTAB
    Flags: [ SHF_ALLOC ]
$ yaml2obj strtab.yaml > strtab.o
$ llvm-objcopy --add-symbol=abs1=1 strtab.o strtab2.o
$ llvm-readobj --sections strtab2.o
...
  Section {
    Index: 1
    Name: .strtab (11)
    Type: SHT_STRTAB (0x3)
    Flags [ (0x2)
      SHF_ALLOC (0x2)
    ]
  }
  Section {
    Index: 2
    Name: .shstrtab (1)
    Type: SHT_STRTAB (0x3)
    Flags [ (0x0)
    ]
  }
  Section {
    Index: 3
    Name: .symtab (19)
    Type: SHT_SYMTAB (0x2)
    Flags [ (0x0)
    ]
    Link: 2  # <-- .shstrtab, not .strtab
  }
...

Running the same example through a recent GNU objcopy, it looks like it creates a second .strtab without the SHF_ALLOC bit and links to that:

$ llvm-objcopy --add-symbol=abs1=1 strtab.o strtab2.o
$ llvm-readobj --sections strtab2.o
  Section {
    Index: 1
    Name: .strtab (9)
    Type: SHT_STRTAB (0x3)
    Flags [ (0x2)
      SHF_ALLOC (0x2)
    ]
  }
  Section {
    Index: 2
    Name: .symtab (1)
    Type: SHT_SYMTAB (0x2)
    Flags [ (0x0)
    ]
    Link: 3 # <-- non-SHF_ALLOC .strtab below, not the SHF_ALLOC .strtab above
  }
  Section {
    Index: 3
    Name: .strtab (9)
    Type: SHT_STRTAB (0x3)
    Flags [ (0x0)
    ]
  }
  Section {
    Index: 4
    Name: .shstrtab (17)
    Type: SHT_STRTAB (0x3)
    Flags [ (0x0)
    ]
  }

This behavior seems reasonable to me vs re-using the .shstrtab for strings that aren't section headers.

To confirm, I think ... && !(Sec.Flags & SHF_ALLOC) is correct here.

BTW, I don't find it particularly constructive to simply state a position without any rationale.

rupprecht added inline comments.Oct 21 2019, 10:41 AM
llvm/tools/llvm-objcopy/ELF/Object.cpp
1550–1552

Running the same example through a recent GNU objcopy, it looks like it creates a second .strtab without the SHF_ALLOC bit and links to that:

$ llvm-objcopy --add-symbol=abs1=1 strtab.o strtab2.o
$ llvm-readobj --sections strtab2.o

This second example is actually what happens w/ GNU objcopy, not llvm-objcopy.

rupprecht added inline comments.Oct 21 2019, 11:05 AM
llvm/tools/llvm-objcopy/ELF/Object.cpp
1593–1599

Obj.SectionNames seems to be first initialized here, so the check above doesn't work.

MaskRay marked an inline comment as done.Oct 21 2019, 11:08 AM
MaskRay added inline comments.
llvm/tools/llvm-objcopy/ELF/Object.cpp
1550–1552

Reusing .shstrtab is what SEC3 tests. It may look a bit strange. e_shstrndx may refer to .strtab (clang reuses the section for both section names and symbol names). If we want to behave like GNU objcopy, we will have to special case the string .strtab. In any case, I don't think the (1) .symtab does not exist (2) --add-symbol= causes .symtab to be created edge case needs more treatment for GNU compatibility.

grimar added inline comments.Oct 27 2019, 11:26 AM
test/tools/llvm-objcopy/ELF/add-symbol-no-symtab.test
36 ↗(On Diff #225379)

For better test coverage, we should have a test that places .shstrtab before .strtab so that we can check the created .symtab picks .strtab as its sh_link field. But I don't know any way to reorder sections.

btw, I've realized that actually you can make a test for this with yaml2obj:

--- !ELF
FileHeader:
  Class:   ELFCLASS64
  Data:    ELFDATA2LSB
  Type:    ET_REL
  Machine: EM_X86_64
Sections:
  - Name: .shstrtab
    Type: SHT_STRTAB
  - Name: .strtab
    Type: SHT_STRTAB
Symbols:
  - Name: a1
    Binding: STB_GLOBAL
Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .shstrtab         STRTAB           0000000000000000  00000040
       000000000000001b  0000000000000000           0     0     0
  [ 2] .strtab           STRTAB           0000000000000000  0000005b
       0000000000000004  0000000000000000           0     0     0
  [ 3] .symtab           SYMTAB           0000000000000000  00000060
       0000000000000030  0000000000000018           2     1     8
jhenderson added inline comments.Oct 28 2019, 8:18 AM
llvm/test/tools/llvm-objcopy/ELF/add-symbol-no-symtab.test
28

You don't necessarily need to update your test here, but in future, could you avoid reusing %t, %t1 etc for different cases if they get modified (e.g. %t is fine to use in this line, but the output should be %t3 or whatever to avoid clashing with the previous case). It makes debugging somewhat easier if there are problems with the test.

32

This comment seems stale? We are reusing .shstrtab, and nothing shows we create .strtab here. Should the test also show that .strtab has NOT been created?

llvm/tools/llvm-objcopy/ELF/Object.cpp
1544

the existing -> an existing (since there can be multiple)
if exists -> if it exists

1550–1552

In an ideal world, the names of ELF sections should be largely irrelevant. I think rather than explicitly prefer ".strtab" over ".shstrtab", it's more correct to say we prefer a string table that is not the section header string table, if such a table exists.

I'm quite alright with a new string table to be created if needed for symbol names, if the existing one is SHF_ALLOC. I don't have a preference for the name (.strtab seems fine, even if it already exists). Is there anything stopping the section header string table having the SHF_ALLOC bit set however? If not, we need a separate test for this case, I believe.