Page MenuHomePhabricator

[llvm-objcopy] Emit error if removing symbol table referenced by SHT_GROUP section
ClosedPublic

Authored by jubnzv on Jun 21 2020, 1:19 AM.

Details

Summary

SHT_GROUP sections contain a reference to a symbol indicating their "signature" symbol. The symbol table containing this symbol is referred to by the group section's sh_link field. If llvm-objcopy is instructed to remove the symbol table, it will emit an error.

This fixes https://bugs.llvm.org/show_bug.cgi?id=46153.

Diff Detail

Event Timeline

jubnzv created this revision.Jun 21 2020, 1:19 AM
jhenderson requested changes to this revision.Jun 22 2020, 12:23 AM

Thanks for taking a look @jubnzv. It's a good start, but there are some things more to look at. In particular, the test is currently broken.

llvm/test/tools/llvm-objcopy/ELF/group.test
58–59

You weren't to know this, but in our newer tests, at least in the LLVM binutils area, we tend to use '##' for comments to make them stand out better from # RUN and similar lines to do with FileCheck. Could you update here, please?

60

There's no need for >/dev/null. That throws away the stdout, which isn't a thing you need to worry about, as llvm-objcopy does not write to stdout.

Note also that the pre-merge bot indicates that your test is currently failing. Please make sure to run the test locally before updating your patch to make sure the test actually works. In this case, the problem is likely %t.o does not exist (the input file should probably be %t). If the test was passing for you locally, check your test output folder to see if there is a %t.o file (actually something like .../group.test.tmp.o) there, and delete it if there is. The test output directory can be found inside your build directory.

61

I find it a little easier to read if there is a blank line between the RUN and the CHECK (or ERR in this case) line.

62–63

llvm-objcopy doesn't modify its input file on error, so there's no need to check it here afterwards.

65

Same comment as above for '##'. Also, missing trailing '.' at the end of the sentence.

67–70

You could fold these two llvm-readobj lines together. If you switch to using llvm-readelf (or llvm-readobj --elf-output-style=GNU, which is essentially the same thing), you'll notice that it has both the sections listed, and the number of sections. I'd then check the full list rather than rely on --implicit-check-not, as it's a little safer against changing output, but it doesn't make much difference. Approximate example:

# SECTIONS:      Section header table contains 6 entries:
## FYI, no need to check the full output, just the name column. FileCheck by default doesn't need to match an entire line.
# SECTIONS-NEXT: Name
# SECTIONS-NEXT: .foo
# SECTIONS-NEXT: .bar
<etc etc>
llvm/tools/llvm-objcopy/ELF/Object.cpp
972

GroupSection already has a SymTab pointer, which should point at the symbol table, if I'm not mistaken. There's no need to use the LinkSection member of Section.

As an aside, the preferred style for error messages is lower-case for the first letter too, not upper-case as you have here.

976

Similar to above, use SymTab, not LinkSection in this method.

What happens if LinkSection (later SymTab) is a nullptr? Please double check to see whether ToRemove handles nullptr or not. If it doesn't, you'll need to handle it here and will want a test that shows the behaviour in this situation.

1367–1373

I don't think you need to make these changes.

llvm/tools/llvm-objcopy/ELF/Object.h
790–792 ↗(On Diff #272296)

It seems to me like moving this and changing the GroupSection base class is unnecessary. What's the motivation behind it?

The reason I observed in the bug that GroupSection is not a Section derivative is because Section already contains the behaviour for preventing the LinkSection being removed by default. I think better would be to not change the base class, and just implement the same error message as already exists in Section::removeSectionReferences, since we want to do something slightly differently.

This revision now requires changes to proceed.Jun 22 2020, 12:23 AM
jubnzv updated this revision to Diff 272889.Jun 23 2020, 6:59 PM

@jhenderson thank you very much for your thoughtful review!

Yes, the problem with the failed test was caused by the fact that I use locally cached outputs with previous version of the test. Now I've fixed it and everything works.
I also corrected my patch according your guidances.

jubnzv marked an inline comment as done.Jun 23 2020, 7:31 PM
jubnzv added inline comments.
llvm/tools/llvm-objcopy/ELF/Object.cpp
976

ToRemove is represented as find method of std::unordered_set, so it will returns past-the-end iterator if it can't find the requested key. So ToRemove could be used without additional null check.

I've also added additional test for possible regressions.

Tip: before updating the diff, select the little "Done" check box on inline comments. This will make it clear to the reviewer that you've read and addressed that comment. The "marked as done" action will then be automatically submitted when you update the diff.

llvm/test/tools/llvm-objcopy/ELF/group.test
71

Thinking about it, we should show what the value of the .group section's sh_link field here is. It's one of the later columns, so you'll want to expand this line to check up to that point (the rest of the values don't matter, so consider something like: .group {{.*}} {{.*}} ... 0. Also update the column header line to extend to the Link column.

77

I think this test case is useful in principle but needs some more work, but the description would probably be better changed to something like:

"Show that llvm-objcopy can handle a group section with a zero sh_link field".

78

I assume you didn't mean to just repeat the previous llvm-objcopy run with the exact same input, and that what you actually meant to do was show that the output of the previous run can be used as an input to llvm-objcopy again?

Assuming I've got that right, you need to update your test input/output file. I also don't think "exit with return code 0" is a particularly useful test on its own - int main(){} passes that test after all. It would be better to show that the input and output files are identical after the run.

jubnzv updated this revision to Diff 273061.Jun 24 2020, 9:11 AM
jubnzv marked 12 inline comments as done.

@jhenderson thanks for the tip. I added some checks for sh_link field and updated the test.

jhenderson added inline comments.Jun 25 2020, 1:08 AM
llvm/test/tools/llvm-objcopy/ELF/group.test
68–69

I'd add extra spaces to make this line up with the rows below. Additionally, I'd add the headers for the columns you have to match below, to clarify which you are actually checking.

71

I made a slight mistake in my original proposal - these should be {{.+}}. Additionally, you don't need to check anything after the fields you care about.

77

On further look, this should be "zero sh_link and sh_info fields" instead of just "a zero sh_link field" I think?

78

To assist with debugging, I'd use a different output file, so that the input file isn't overwritten. I'd them cmp the output with the input, since they should match. As things stand, this is still no better than an empty main, since you are only checking that llvm-objcopy returns 0, and not that it actually does what you expect too.

jubnzv updated this revision to Diff 273264.Jun 25 2020, 2:13 AM
jubnzv marked 4 inline comments as done.

I agree, this makes it cleaner.

I also found a strange behavior of {{.*}} regex in group.test. It seems that it doesn't match an empty sequence.

For example, when I trying to match the second line here (empty Flg field):

[Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
[ 1] .group            GROUP           0000000000000000 000040 000008 04     0  0   4

This regex works as expected:

# SECTIONS-NEXT: .group    {{.+}} {{.+}}  {{.+}} {{.+}} {{.+}}     0  0

But this one doesn't match the line:

# SECTIONS-NEXT: .group    {{.+}} {{.+}}  {{.+}} {{.+}} {{.+}} {{.*}} 0  0
jhenderson accepted this revision.Jun 25 2020, 3:43 AM

I agree, this makes it cleaner.

I also found a strange behavior of {{.*}} regex in group.test. It seems that it doesn't match an empty sequence.

For example, when I trying to match the second line here (empty Flg field):

[Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
[ 1] .group            GROUP           0000000000000000 000040 000008 04     0  0   4

This regex works as expected:

# SECTIONS-NEXT: .group    {{.+}} {{.+}}  {{.+}} {{.+}} {{.+}}     0  0

But this one doesn't match the line:

# SECTIONS-NEXT: .group    {{.+}} {{.+}}  {{.+}} {{.+}} {{.+}} {{.*}} 0  0

Without investigating it in too much depth, I suspect you've tripped over FileCheck's canonicalization of whitespace. Essentially, FileCheck normalises all consecutive whitespace in both the pattern and the input to match against into a single space character. Therefore, if your input is something like xyz abc, and your pattern is xyz abc FileCheck will treat them both as xyz abc, meaning it will match. I suspect, but haven't checked, that this is applied before interpreting regexes. Therefore xyz {{.*}} abc will be treated as xyz a regex matching an empty string and abc in sequence. Thus the final pattern string being used to match with will contain 2 spaces (one either side of the "empty" regex). That means that it won't match xyz abc, or xyz abc or even xyz abc (because those are all treated as having only one space). You can use --strict-whitespace to disable this canonicalization. Take a look at the documentation for details. I hope that makes sense.

Anyway, the change as-is LGTM.

This revision is now accepted and ready to land.Jun 25 2020, 3:43 AM
jubnzv accepted this revision.EditedJun 26 2020, 8:00 PM

@jhenderson thanks for helping me. I'll remember these tips for next time.

Could you please commit this for me?

Higuoxing accepted this revision.Jun 26 2020, 8:25 PM
Higuoxing added a subscriber: Higuoxing.

LGTM with one nit.

llvm/test/tools/llvm-objcopy/ELF/group.test
59

'have the a group' -> 'have a'

jubnzv accepted this revision.Jun 26 2020, 9:20 PM
jubnzv updated this revision to Diff 273882.
jubnzv marked an inline comment as done.

@jhenderson thanks for helping me. I'll remember these tips for next time.

Could you please commit this for me?

Can do. What do you want for your Git author name and email?

@jhenderson thanks for helping me. I'll remember these tips for next time.

Could you please commit this for me?

Can do. What do you want for your Git author name and email?

Georgy Komarov <jubnzv@gmail.com>

This revision was automatically updated to reflect the committed changes.