This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objcopy] Fix crash when removing symbol table at same time as adding a symbol
ClosedPublic

Authored by jubnzv on Jul 1 2020, 12:43 AM.

Details

Summary

This patch resolves crash that occurs when user wanted to remove all
symbols and add a brand new one using:

llvm-objcopy -R .symtab --add-symbol foo=1234 in.o out.o

Before these changes the symbol table internally being null when adding
new symbols. For now we will regenerate symtab in this case.

This fixes: https://bugs.llvm.org/show_bug.cgi?id=43930

Diff Detail

Event Timeline

jubnzv created this revision.Jul 1 2020, 12:43 AM

I don't see a benefit in having two different places where the symbol table can be created. I think you probably just want to completely remove the logic from ELFBuilder for creating a symbol table. With a bit of care, you then should be able to create the symbol table when you need to rather than early, and at the same time remove the whole EnsureSymtab logic from the ELFBuilder/ELFReader stack. You'll need to see where things reference the symbol table though to make sure there's no risk of links not being set up properly.

llvm/test/tools/llvm-objcopy/ELF/add-symbol-no-symtab.test
111 ↗(On Diff #274703)

You shouldn't need --allow-broken-links.

%t has no symbols in it. I think this test case would be better off using an input file that does contain symbols. I'd recommend splitting this case off into a new file and adding your own YAML with one or more symbols in, showing that the old symbols have been deleted and that new ones have been added. You'll want to show that both the new symbols and the new symtab section header are correct too.

113 ↗(On Diff #274703)

Nit: The norm is to not put comment markers on blank line.

jubnzv updated this revision to Diff 274787.Jul 1 2020, 7:36 AM
jubnzv marked 3 inline comments as done.

Thanks for the comment @jhenderson. I updated the test.

I think you probably just want to completely remove the logic from ELFBuilder for creating a symbol table. With a bit of care, you then should be able to create the symbol table when you need to rather than early, and at the same time remove the whole EnsureSymtab logic from the ELFBuilder/ELFReader stack.

It is possible to move logic for creating symbol table in a free function or make method in the Object class that performs initialization on demand.
But the problem is that ELFBuilder also sets reference symbols for SH_LINK and SH_INFO fields of relocation sections when creating a new Object. To perform this, it needs an initialized symbol table. All cases that requires symtab are better seen at D69093 diff: https://reviews.llvm.org/differential/changeset/?ref=1658725.
So I will also need to separate set relocations logic and apply it only when it needed, after creating an Object. I think it makes code more hard to read.
Or are simpler solutions possible?

Thanks for the comment @jhenderson. I updated the test.

I think you probably just want to completely remove the logic from ELFBuilder for creating a symbol table. With a bit of care, you then should be able to create the symbol table when you need to rather than early, and at the same time remove the whole EnsureSymtab logic from the ELFBuilder/ELFReader stack.

It is possible to move logic for creating symbol table in a free function or make method in the Object class that performs initialization on demand.
But the problem is that ELFBuilder also sets reference symbols for SH_LINK and SH_INFO fields of relocation sections when creating a new Object. To perform this, it needs an initialized symbol table. All cases that requires symtab are better seen at D69093 diff: https://reviews.llvm.org/differential/changeset/?ref=1658725.
So I will also need to separate set relocations logic and apply it only when it needed, after creating an Object. I think it makes code more hard to read.
Or are simpler solutions possible?

Actually, does the createSymbolTable function even need to be a member of ELFBuilder? My main objection is the use of the builder a long time after initialization of the Object (along with the complexity of calling it due to the template), but I don't think you actually need it.

jubnzv updated this revision to Diff 275294.Jul 2 2020, 10:32 PM

Actually, does the createSymbolTable function even need to be a member of ELFBuilder? My main objection is the use of the builder a long time after initialization of the Object (along with the complexity of calling it due to the template), but I don't think you actually need it.

Yes, I agree, this template call looked clumsily. I wasn't sure which class in the hierarchy should be responsible for symtab creation.
I think we can define a method in the Object itself.

Nearly there, just a few small changes to make now, I think.

llvm/test/tools/llvm-objcopy/ELF/add-symbol-new-symtab.test
2–3

Perhaps this comment could be simplified slightly:

"Show that llvm-objcopy can replace the symbol table with a new one."

5

In general, I find tests easier to follow when related RUN lines are kept together. Thus I'd go:

# RUN: yaml2obj ...
# RUN: llvm-objcopy ...
# RUN: llvm-readelf ... | FileCheck %s

# CHECK: ...

<YAML>
37

I don't think you need this llvm-readelf line, or the corresponding checks, since it is just verifying yaml2obj does the right thing, but we can assume it does, since there is testing for that in the yaml2obj testing.

50

I assume you are using it to show the sh_link value of the new symbol table is correct? If so, you a) can add -S to the llvm-readelf call above and b) just use the same check-prefix for everything (CHECK is fine, since you will only have the one FileCheck command).

58

Aside from the fact that the checks and commands are not in a sensible order, you don't need two separate commands, as I note in my previous comment.

llvm/tools/llvm-objcopy/ELF/Object.cpp
1909

I think you want something, probably an assert, that prevents this function being called if there already is a symbol table in the Object.

1911

Whilst you're moving this code, please change this to not use auto. I was a bit lax earlier in my reviewing of this code when it was all first written, and I find auto in this context makes it harder to figure out exactly what sort of section class Sec actually is.

llvm/tools/llvm-objcopy/ELF/Object.h
1043

I'd call the new function addSymbolTable (or possibly addNewSymbolTable and put it next to the addSection function in the header and code as conceptually, the two are related.

jubnzv updated this revision to Diff 275333.Jul 3 2020, 2:23 AM
jubnzv marked 8 inline comments as done.

Thanks. Fixed.

jhenderson added inline comments.Jul 3 2020, 2:56 AM
llvm/test/tools/llvm-objcopy/ELF/add-symbol-new-symtab.test
5

You haven't moved the YAML to the end, as I asked...

45

Since this is a new section, I'd check the values are all correct, rather than skipping them. Don't forget the alignment column!

llvm/tools/llvm-objcopy/ELF/Object.cpp
1909

Add a && "some explanatory message to this assert.

jubnzv updated this revision to Diff 275349.Jul 3 2020, 3:51 AM
jubnzv marked 3 inline comments as done.
jubnzv marked an inline comment as not done.Jul 3 2020, 3:55 AM
jubnzv added inline comments.
llvm/test/tools/llvm-objcopy/ELF/add-symbol-new-symtab.test
45

I agree. I think it will be more readable if I add values for the other columns as well.
The other tests for llvm-objcopy are written in the same way.

jhenderson added inline comments.Jul 3 2020, 4:21 AM
llvm/test/tools/llvm-objcopy/ELF/add-symbol-new-symtab.test
45

It's not just about readability, it's about correctness. You could have added a new section called .symtab, with invalid properties easily (e.g. incorrect type, or a non-zero info field, for example). Without the explicit value checks, this would fail.

One exception is that you don't need to check the Offset field. The reason for that is because if we changed something internal in llvm-objcopy that changed the position the symbol table is put in the file, it would cause this test to break, but the Offset isn't actually something you control by simply adding the section, plus the ability to dump the symbol table validates that as a side-effect.

jubnzv updated this revision to Diff 275367.Jul 3 2020, 5:16 AM
jubnzv marked an inline comment as not done.
jhenderson accepted this revision.Jul 3 2020, 5:26 AM

LGTM, with one suggestion.

If you haven't already, you should be able to request commit access, as outlined in the docs.

llvm/test/tools/llvm-objcopy/ELF/add-symbol-new-symtab.test
7

Add --match-full-lines to the FileCheck call. This will prevent there being spurious data at the end of the line. For example, if a section's alignment were 16, a check of 1 would still pass, because FileCheck only does partial matches by default.

This revision is now accepted and ready to land.Jul 3 2020, 5:26 AM
jubnzv marked an inline comment as done.Jul 3 2020, 5:29 AM
jubnzv added inline comments.
llvm/test/tools/llvm-objcopy/ELF/add-symbol-new-symtab.test
45

Understood, thanks for the explanation.
Interestingly, that the offset value generated by objcopy --add-section from GNU binutils and llvm-objcopy differs due to different internal implementation.

jubnzv updated this revision to Diff 275376.Jul 3 2020, 5:34 AM
This revision was automatically updated to reflect the committed changes.