This is an archive of the discontinued LLVM Phabricator instance.

[ELF][llvm-objcopy] Reject duplicate SHT_SYMTAB sections.
ClosedPublic

Authored by MosheBerman on Feb 7 2023, 8:51 AM.

Details

Summary

Reject duplicate SHT_SYMTAB sections

The gABI prohibits multiple SH_SYMTAB sections. As a result, llvm-objcopy was crashing in SymbolTableSection::removeSymbols(). This patch fixes the issue by emitting an error if multiple SH_SYMTAB sections are encountered when building an ELF object.

Diff Detail

Event Timeline

MosheBerman created this revision.Feb 7 2023, 8:51 AM
MosheBerman requested review of this revision.Feb 7 2023, 8:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 7 2023, 8:51 AM
MosheBerman edited the summary of this revision. (Show Details)Feb 7 2023, 8:51 AM
MosheBerman edited the summary of this revision. (Show Details)Feb 7 2023, 8:55 AM
MosheBerman edited the summary of this revision. (Show Details)
MosheBerman edited the summary of this revision. (Show Details)Feb 7 2023, 9:20 AM

Changed punctuation in test so it matches the code.

Hi @MosheBerman,

  • You've uploaded your diff without full context. This can make it harder to review (although in this case it's fine). Please make sure to always upload your diff with full context, as per the instructions on the LLVM website.

Do we want to assert here instead if an if check? (I looked at the relevant parts of the LLVM Coding Standards for guidance on formatting the logic checking, and found "assert liberally" section the but didn't see anything else explicit about this kind of check. )

  • Asserts are for programmer errors within LLVM. They are not for end-user facing errors. Another way you could think about it is a user could somehow craft an input that hits this check, it isn't an assert.

Is there anything we can add to the error to guide the reader to unblock themselves? (I included the quote from the ELF gABI docs in the error message to make it more searchable online.)

  • We can't know how a user created the input, therefore it is impossible to give them more guidance than "don't have multiple symbol tables in your input" (which is essentially what the error says).
llvm/lib/ObjCopy/ELF/ELFObject.cpp
1707

I'd use the term "Multiple" not "Duplicate", because "Duplicate" implies identical, which they don't need to be.

1710–1711

If you wanted to provide the user with more information, about the only information you could give them that would be useful are the section indices of the two symbol tables. However, without further changes, I don't believe this information is currently available here. I don't think it's worth complicating the interface of the function to support it either, as it should be fairly easy for the end user to identify the two (or more!) sections causing the issue using tools like llvm-readelf.

I think you can slightly simplify this message to "multiple SHT_SYMTAB sections are not supported". There's no need for the "Currently..." part of the message, since that's implied by the fact that we're emitting the error in the first place.

llvm/test/tools/llvm-objcopy/ELF/symtab-duplicate.test
1 ↗(On Diff #495653)

It might be useful to add a comment to the top of this test file with the gABI quote in, in case anybody runs into this behaviour and wonders about it. This also gives the test a little more context.

2–3 ↗(On Diff #495653)

These lines can be simplified as per the inline edit.

I'd also add a blank line between the RUN and CHECK lines, as indicated, and I'd check the full error message that llvm-objcopy produces.

9 ↗(On Diff #495653)

ET_DYN should probably be ET_REL, although it doesn't really matter (ET_DYN files usually have dynamic symbols etc).

10 ↗(On Diff #495653)

I believe the Machine line is unnecessary these days, as a default value is provided that should be sufficient (I could be getting mixed up with something else though, so if an error starts to be generated after removing it, put it back in!).

14 ↗(On Diff #495653)

Symbol tables generally don't have the SHF_ALLOC flag, so you can delete this and the equivalent line below.

18–19 ↗(On Diff #495653)

Looks like there's trailing whitespace on the final line, hence the "No newline at end of file" message, even though there is a blank line here.

MosheBerman added a comment.EditedFeb 8 2023, 7:53 AM

Hi @MosheBerman,

  • You've uploaded your diff without full context. This can make it harder to review (although in this case it's fine). Please make sure to always upload your diff with full context, as per the instructions on the LLVM website.

Thanks for pointing this out. Can I trouble you for a link to the specific page on the LLVM website, or an example of good context, so I know what to do for next time? (I looked at the Contributing and Developer Policy pages for "context" and did not see anything.

Do you mean in the patch itself?

llvm/lib/ObjCopy/ELF/ELFObject.cpp
1707

Great point. I struggled with the word choice in the message, and I did use "multiple" there for that exact reason. I'll change it here. Let's also rename the test file to match.

1710–1711

If you wanted to provide the user with more information, about the only information you could give them that would be useful are the section indices of the two symbol tables. However, without further changes, I don't believe this information is currently available here.

I experimented with displaying the section names with the Shdr parameter and was only able to get the Name from the original section.

as it should be fairly easy for the end user to identify the two (or more!) sections causing the issue using tools like llvm-readelf.

How would you feel about mentioning that in the message?

There's no need for the "Currently..." part of the message, since that's implied by the fact that we're emitting the error in the first place.

Fair. The reason I included that phrase was because Google searching for that phrase matches the docs and leads right to it. Do you think there's a benefit to doing so, or is the goal brevity?

This is all helpful feedback, my goal was to go beyond "something broke" to "something broke. here's how to fix it." I'll definitely make changes based on your responses to my two questions. (1. Mentioning llvm-readelfand 2. Searchable phrase in the output.)

MosheBerman added inline comments.Feb 8 2023, 8:00 AM
llvm/test/tools/llvm-objcopy/ELF/symtab-duplicate.test
14 ↗(On Diff #495653)

Do I delete the contents of the brackets or the entire line?

(Sidebar, what's the idiomatic term for the brackets here? Is it 'Array,' 'List,' 'Set,' or something else?

18–19 ↗(On Diff #495653)

Ooh, thanks for explaining this. (My VS Code is auto-formatting incorrectly) I'll run clean it up.

MosheBerman added inline comments.Feb 8 2023, 8:13 AM
llvm/test/tools/llvm-objcopy/ELF/symtab-duplicate.test
10 ↗(On Diff #495653)

Tests pass locally without it.

MosheBerman marked 2 inline comments as done.

Address feedback.

MosheBerman marked an inline comment as done.Feb 8 2023, 9:43 AM
MosheBerman marked an inline comment as done.

Updating with tests passing locally.

Hey, I think this is ready for a second review. Is there anything else I need to do?

Hey, I think this is ready for a second review. Is there anything else I need to do?

Sorry, I was off sick for a couple of days last week, and have a bit of a backlog of other work to work my way through. I'll get to this later this week though.

Hi @MosheBerman,

  • You've uploaded your diff without full context. This can make it harder to review (although in this case it's fine). Please make sure to always upload your diff with full context, as per the instructions on the LLVM website.

Thanks for pointing this out. Can I trouble you for a link to the specific page on the LLVM website, or an example of good context, so I know what to do for next time? (I looked at the Contributing and Developer Policy pages for "context" and did not see anything.

Do you mean in the patch itself?

I'm assuming you've been uploading the diffs via the web UI? You need to ensure you've created the diff using something like git diff -U9999999 (see https://llvm.org/docs/Phabricator.html#requesting-a-review-via-the-web-interface). By default git diffs only have a very limited context available.

llvm/lib/ObjCopy/ELF/ELFObject.cpp
1710–1711

I don't think your second sentence is really all that useful. Chances are that if a user is using llvm-objcopy and have an ELF in this weird format, they already know about llvm-readelf and how to inspect objects. They may well also know about the ELF gABI, and it appears on the front page of Google currently if you search using the exact error message.

We should aim for diagnostic messages to be clear and concise. In this case, the additional information you've provided is speculative and to many users potentially unhelpful. Including the section names on the other hand could potentially be useful, since that will help the user identify the offending sections.

llvm/test/tools/llvm-objcopy/ELF/symbol-prohibit-multiple-symtab-sections.test
1–2

We use ## for comments in llvm-objcopy tests, to help them stand out from the real test lines.

The test name can be simplified to "multiple-symtab.test". The current name is unnecessarily verbose, and if the behaviour ever changes, this test could be adapted to cover the new behaviour instead, without needing to rename it. In other words the test name here can just describe the input.

Thanks for the review @jhenderson, and I hope you're feeling better. I incorporated all of the feedback:

  • Renamed test file to match feedback.
  • Cleaned up the error output and updated test.

Also, I see what you mean about context. I think that I included the whole file this time, using git show HEAD -U999999 | bcopy and pasting into the web UI.

jhenderson added inline comments.Feb 15 2023, 1:35 AM
llvm/lib/ObjCopy/ELF/ELFObject.cpp
1710

Nit: as per the coding standards, error messages shouldn't have a trailing full stop.

llvm/test/tools/llvm-objcopy/ELF/multiple-symtab.test
2 ↗(On Diff #497535)

The comment should also say what the purpose of this test is, i.e. add something like "This test shows that we emit an error if we encounter multiple SHT_SYMTAB sections."

MosheBerman marked an inline comment as done.

Addressed feedback:

  • Cleaned up error message format.
  • Added more context to test.
MosheBerman marked an inline comment as done.Feb 15 2023, 7:27 AM
jhenderson accepted this revision.Feb 15 2023, 9:28 AM

Code LGTM. Could you reformat the description so that it matches your planned commit message, please?

Do you have commit access?

This revision is now accepted and ready to land.Feb 15 2023, 9:28 AM

Code LGTM. Could you reformat the description so that it matches your planned commit message, please?

Great! I'll update the description.

Do you have commit access?

I do not. This is my first diff, so I think I'm supposed to have a committer land it for me. I'll follow your lead.

MosheBerman edited the summary of this revision. (Show Details)Feb 15 2023, 11:00 AM
MosheBerman edited the summary of this revision. (Show Details)

Updated the diff summary in Phabricator. lmk if that wasn't the description you meant.

Code LGTM. Could you reformat the description so that it matches your planned commit message, please?

Great! I'll update the description.

Do you have commit access?

I do not. This is my first diff, so I think I'm supposed to have a committer land it for me. I'll follow your lead.

Okay, please let me know your name and email address that should be listed in the commit message.

FYI, there's a typo in your commit message ("craashing"). I'll fix that when landing this, unless you get there first.

MosheBerman edited the summary of this revision. (Show Details)

Fixed the commit message.

Code LGTM. Could you reformat the description so that it matches your planned commit message, please?

Great! I'll update the description.

Do you have commit access?

I do not. This is my first diff, so I think I'm supposed to have a committer land it for me. I'll follow your lead.

Okay, please let me know your name and email address that should be listed in the commit message.

My name is Moshe Berman, and my email is mosheberman@users.noreply.github.com.

FYI, there's a typo in your commit message ("craashing"). I'll fix that when landing this, unless you get there first.

I attempted to fix it just now. Feel free to tweak the commit message.

I've gone ahead and landed this commit. Please keep an eye out for any failure emails from build bots that can be attributed to this patch (NB: there may be spurious failures that are unrelated). If necessary, reach out to someone else to help fix or revert this, as I'm now finishing work for the weekend.

I've gone ahead and landed this commit. Please keep an eye out for any failure emails from build bots that can be attributed to this patch (NB: there may be spurious failures that are unrelated). If necessary, reach out to someone else to help fix or revert this, as I'm now finishing work for the weekend.

Thank you! I’ll keep an eye out. As an FYI, I’m completely offline from Friday night to Saturday night. (Religious day of rest.) I’ll check later today and on Saturday night.

I've gone ahead and landed this commit. Please keep an eye out for any failure emails from build bots that can be attributed to this patch (NB: there may be spurious failures that are unrelated). If necessary, reach out to someone else to help fix or revert this, as I'm now finishing work for the weekend.

In addition to my previous comment, thank you for this opportunity and your patient feedback.