This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objcopy] Add support for large indexes
ClosedPublic

Authored by jakehehrlich on Jan 24 2018, 6:29 PM.

Details

Summary

Because of -ffunction-sections (and maybe other use cases I'm not aware of?) it can occur that we need more than 0xfeff sections but ELF dosn't support that many sections. To solve this problem SHN_XINDEX exists and with it come a whole host of changes for section indexes everywhere. This change adds support for those cases which should allow llvm-objcopy to copy binaries that have an arbitrary number of sections.

Diff Detail

Repository
rL LLVM

Event Timeline

jakehehrlich created this revision.Jan 24 2018, 6:29 PM

forgot context

As is there are a few problems but I didn't want to miss another day of review just because I didn't finish this today

  1. There are no tests. This needs several tests. Recommendations on how to test this would be appreciated. I've been locally testing using an ELF I created that has 64k symbols and 128k sections but llvm-objcopy takes 5 seconds on that. I can work on optimizing that binary (smaller file, just the right amount of sections, etc...) but that still seems extreme. Additionally I think testing some combination of removal and such would be nice.
  2. What if an --add-section pushes the section count over to SHN_LORESERVE? should we synthesize a SectionIndexSection?
  3. What if stripping sections pushes the section counter below SHN_LORESERVE? Should we remove the SectionIndexSection because it isn't needed?
  4. I wasn't sure if section indexes of symbols below SHN_LORESERVE should have SHN_XINDEX as their index despite it being possible to directly encode them. Should every symbol have SHN_XINDEX for st_shndx?

Ah, the joys of Prop74/Large ELFs.

As is there are a few problems but I didn't want to miss another day of review just because I didn't finish this today

  1. There are no tests. This needs several tests. Recommendations on how to test this would be appreciated. I've been locally testing using an ELF I created that has 64k symbols and 128k sections but llvm-objcopy takes 5 seconds on that. I can work on optimizing that binary (smaller file, just the right amount of sections, etc...) but that still seems extreme. Additionally I think testing some combination of removal and such would be nice.

I don't think 5 seconds is all that unreasonable, due to the size of the ELF, but it may be worth optimizing anyway. Maybe you could have a little python script that generates a YAML file for yaml2obj (assuming yaml2obj can handle very large ELFs).

  1. What if an --add-section pushes the section count over to SHN_LORESERVE? should we synthesize a SectionIndexSection?
  2. What if stripping sections pushes the section counter below SHN_LORESERVE? Should we remove the SectionIndexSection because it isn't needed?

I think we might want to rebuild this section every time, prior to finalizing the ELF. Effectively, the read-in version uses this section only to interpret section indexes, and then a new one is created from scratch, if needed, afterwards. We could refuse to create it, but I wouldn't recommend that, personally. The gABI definitely suggests that this process should only be followed for larger ELFs, and not for those that don't need it.

  1. I wasn't sure if section indexes of symbols below SHN_LORESERVE should have SHN_XINDEX as their index despite it being possible to directly encode them. Should every symbol have SHN_XINDEX for st_shndx?

I think SHN_XINDEX should only be used where the value can't be represented (i.e. >= SHN_LORESERVE). The ELF gABI says:

"If this member contains SHN_XINDEX, then the actual section header index is too large to fit in this field."

Just to check, which document are you using for the quotes? It would be good to include the URLs to the parts of the documents.

One general point: theoretically, each symbol table could have its own SYMTAB_SHNDX section. The obvious case for this would be the static symbol table and the dynamic symbol table. The corresponding section is indicated by the sh_link field of the SYMTAB_SHNDX section, so we should be using that to establish a link. I think that implies that the SYMTAB_SHNDX section pointer should be a member of the symbol table class, not of the Object.

tools/llvm-objcopy/Object.cpp
72

I'd call this the "symbol section index table" for a bit more clarity.

166

Full stop.

280–282

I think this should have a value SHN_UNDEF unless the index is >= SHN_LORESERVE.

595–602

It seems to me like this block should only happen the first time we encounter a SHN_XINDEX symbol. After that, we should already have the data, and should be able to load from it directly.

603

Somewhere, we should check that the symtab shndx section has the same number of entries as the corresponding symbol table.

704

I'd avoid calling references to the symtab shndx section "Shndx" as that's a common abbreviation for section header index, which could lead to confusion or ambiguity in places. I'd recommend calling it ShndxSection or similar.

863

Comment looks like it's wrapped early?

875

Fix comment wrapping here too.

905–916

I feel like having these quotes duplicated is not great. I wonder whether it's a sign that the null shdr should be constructed in the writeEhdr function, so that the decisions only need making the once? Thoughts?

tools/llvm-objcopy/Object.h
385

I'd rename this member to SectionIndexTable (or ShndxTable)

390

I'd rename this function "setShndxTable" since setShndx implies you're setting the section header index of the section.

  1. Still no tests
  2. changed error message for clarity
  3. fixed typos
  4. populated the section index table with SHN_UNDEF for values < SHN_LORESERVE
  5. computed address of section index table data just once in initSymbolTable
  6. avoided Shndx use for section index table name
  7. I haven't updated "SectionIdexes" to "SectionIndexTable" because I ran out of time. I'll get back to that tomorrow.
tools/llvm-objcopy/Object.cpp
863

I want "SHN_LORESERVE (0xff00)" to stay together which means I have to wrap early. It looksreally odd if you wrap the parenthsies to the next line. Since this is a direct quote I don't want to take the "(0xff00)" out to better resolve that issue.

905–916

I'm a bit conflicted on this one. I like having the logic for writing the whole section header table in one place because it makes the logic of "do we write the section header table out or not" more succinct. I also however agree with your claim that quoting this twice is annoying.

I can think of two options here:

  1. remove the comments in the second case and just leave a URL
  2. move this to writeEhdr just like you said

Changed SectionIndexes to SectionIndexTable everywhere

jhenderson added inline comments.Jan 30 2018, 6:39 AM
tools/llvm-objcopy/Object.cpp
739

properlly -> properly

905–916

I think I'd prefer a third option actually: remove these comments, and replace with a reference to the standard quote in writeEhdr(), i.e. "// See writeEhdr for why we do this" or similar.

1110

This line doesn't read right, and I think you should avoid using "S/shndx" here. Refer to it as the symbol section index table or similar. It also looks like the last sentence is incomplete.

1115

">=" here?

1137

Typo in "SectionIndxes" and missing full-stop.

1153–1154

It's not clear to me what this comment is saying. Why would symbols keep sections updated or not as the case may be?

Hey I'm back!

This change fixes previous comments by James and adds a test. Other tests that are needed yet still

  1. Make sure section index table is removed when not needed
  2. Make sure section index table is added when it is needed
  3. Check error for binary output
  4. Check initialization errors for section index table
  5. To the existing test add checks for the appropriate fields for section index table
  6. Check error for case when symbol has SHN_XINDEX index but not

If you can think of more tests please inform.

  1. Check error for binary output

What error would you expect in this case? Why wouldn't it just work?

  1. Check error for case when symbol has SHN_XINDEX index but not

I think you messed up the end of this sentence :)

How about tests for attempting to strip the SYMTAB_SHNDX table? I.e. We should refuse to do so, if there are too many sections. An interesting edge case is adding sections such that we now require the table, but also explicitly stripping the symtab_shndx table if it exists. I'm not clear on the behaviour here, but I guess we should throw away the old one and build one from scratch in this case.

Can't think of any other cases off the top of my head.

test/tools/llvm-objcopy/many-sections.S
1 ↗(On Diff #135194)

Lower-case .s is more common for test files.

47 ↗(On Diff #135194)

Could this be SECS: Index: 65542?

Similar for symbols.

tools/llvm-objcopy/Object.cpp
661

Unnecessary braces.

905–916

Unfortunately, I think you've gone the wrong way round here! The quote refers to "this member" which only makes sense if it's talking about the Elf Header, so this and the other quote belong in writeEhdr.

1114

Could this loop be improved to not loop over every section? You only need to loop over sections from Index >= SHN_LORESERVE upwards, don't you? Or is this vector not ordered by section index?

1117

break here after this is set.

1123

definitly -> definitely

tools/llvm-objcopy/Object.h
369

The ELF gABI suggests that the name ".symtab_shndx" is used for the SHT_SYMTAB_SHNDX section.

An old commit from 2014 seems to have changed the name emitted by MC from this to ".symtab_shndxr". I'm not sure why the name changed there. Searching for that string online only yields that commit and references to the LLVM code-base, so it doesn't look like there's prior art for that name. Independently of this change, I think finding out why the name was changed (and change it back if appropriate) would be a sensible exercise.

390

Ping on this comment?

394

getShndx() -> getShndxTable(), since getShndx() implies getting the section index for this section, which it doesn't (see also setShndx).

One other point - it would be good to compare the performance of llvm-objcopy to GNU objcopy in this case, and if we're a lot worse, run a profiler at some point to identify what we're doing that makes things worse.

One other point - it would be good to compare the performance of llvm-objcopy to GNU objcopy in this case, and if we're a lot worse, run a profiler at some point to identify what we're doing that makes things worse.

I don't know which cases this happens in but the reason this feature was requested is because under some cases objcopy has O(n^2) behavior for sections. I have one preliminary data point that makes me hopeful which is that the unoptimized debug llvm-objcopy is only about twice as slow as the optimized GNU objcopy is right now. e.g. I think just doing a release build would bump us to comparable performance and then we'd still have room to profile and optimize after that.

emaste added a subscriber: emaste.Feb 21 2018, 3:51 PM

One other point - it would be good to compare the performance of llvm-objcopy to GNU objcopy in this case, and if we're a lot worse, run a profiler at some point to identify what we're doing that makes things worse.

If someone's running performance comparisons it would be good to also include ELF Tool Chain's version.
https://sourceforge.net/p/elftoolchain/wiki/Home/

(We replaced (most of) binutils with ELF Tool Chain in FreeBSD.)

What error would you expect in this case? Why wouldn't it just work?

It depends on the endianness of the system we're writing to so it the BinaryWriter should throw an error if you try to write an allocated one. Later (in a seperate patch) we'll need to support large indexes for dynamic symbols which I plan on handling the general way we've handled dynamic stuff (by just copying). At that point the error will become impossible until we have section editing.

Check error for case when symbol has SHN_XINDEX index but not...

That was meant to be "Check error for case when symbol has SHN_XINDEX index but no symbol index table". There's an error in the code but it isn't checked that its actually thrown in that case at the moment.

How about tests for attempting to strip the SYMTAB_SHNDX table? I.e. We should refuse to do so, if there are too many sections. An interesting edge case is adding sections such that we now require the table, but also explicitly stripping the symtab_shndx table if it exists. I'm not clear on the behaviour here, but I guess we should throw away the old one and build one from scratch in this case.

I think they should be allowed to strip it but if it winds up being needed we should add it in (which is what we do now). So there shouldn't be an error, it should just wind up having a SYMTAB_SHNDX table despite the fact that they stripped it. We should have a test for this case to make sure it's acceptable. We should also have a case for the edge case you mention.

test/tools/llvm-objcopy/many-sections.S
47 ↗(On Diff #135194)

For sections I can do this but for symbols I can't. Also I'm glad you had me do this test. Somewhere in the output of llvm-readobj there is an extra occurence of "Name" than there should be. The largest section index is 65540 which means that there should be 65541 occurrences of "Name:" which means this test had an issue with it. It turns out that at the start of every llvm-readobj output there is a field "LoadName:" which triggers the count on this. So instead I'll use the Index number for sections and "Symbol {" count for symbols.

tools/llvm-objcopy/Object.cpp
1114

You have this right. It can also exit early if it finds one because we'll have to reassign indexes later anyway. Good catch.

I spent some time today figuring out how to a) improve the time it takes the test to run and b) make the uploaded binary as small as possible. I'm uploading a binary so that not every test has to regenerate the same file (which dominates the running time of generating this file). I tried to make this binary as small and as compressible as possible. I got the compressed archive down to 147 kb which I think is acceptable. On my machine the total running time of decompressing, running llvm-objcopy, and then checking the data, is about 1.6 seconds which is acceptably fast in my opinion. This problem was much more solvable than when I implemented 64-bit symbol offsets for archives.

For some details on the file:

  1. it has 65541 sections, 65537 symbols
  2. every symbol is named "x" but is technically a unique section. Each symbol is defined in a different section
  3. every symbol is named "x" but is technically a unique section.

The uncompressed binary is about 6 MB which is about a 98% compression ratio. I can make the whole thing a bit smaller if I only use a small number of symbols though.

You can download the zip file that I'm using here: https://drive.google.com/file/d/1u6W1mUHkFBPsLzEV4u50M_BeiPKMyipM/view?usp=sharing

I'll follow up later with the other mentioned tests. The plan is to have them each uncompress the binary.

Thanks for this. It demonstrates some particularly poor behaviour in ELF Tool Chain's elfcopy (so, if anyone's comparing objcopy versions, there's not much value in including ELF Tool Chain right now).

Code looks okay to me, aside from a couple of comment nits. I'll hold off approving though, since I'd like to see the other tests.

tools/llvm-objcopy/Object.cpp
875

Weird wrapping again?

1130

Have you wrapped slightly early here? I think we can get this comment down to 2 lines.

Thanks for this. It demonstrates some particularly poor behaviour in ELF Tool Chain's elfcopy (so, if anyone's comparing objcopy versions, there's not much value in including ELF Tool Chain right now).

I had to commit many atrocities to make that file. I actually have a desire for a yaml2obj like tool that would let me construct ELFs more succinctly. That binary was produced by getting as close as I could with a generated .s and then using many different tools to tweak the results. It wasn't pretty.

  1. Make sure section index table is removed when not needed
  2. Make sure section index table is added when it is needed
  3. Check error for binary output
  4. Check initialization errors for section index table
  5. To the existing test add checks for the appropriate fields for section index table
  6. Check error for case when symbol has SHN_XINDEX index but no section index table exists

So these tests are hard to fulfill. 1. is done. 2. isn't possible because there is no way to a) add a symbol to a section or b) add a section before another section. As soon as symbols can be added or section position can be edited this will need to be tested. 3. isn't possible because there isn't really a way (short of using dd to make a binary modification) to make the section header table allocated. Also eventually dynamic section header tables will need to be supported and such a test will become impossible until section editing becomes a thing because allocated section header tables will be treated the way dynamic sections will be. Checking the initialization error also requires binary editing the link of the section index table (Again once we can edit sections, that will be a proper test). I added the extra checks for the section header index fields. It's not an ideal test but to test that

So unless you have other test ideas I think these tests are the best I can do. They're better than a basic smoke test but they're not 100% complete.

Added tests. I think this is about as good as it's going to get without adding other features to llvm-objcopy.

Two comment wrapping issues are still outstanding (I've pinged them).

It's not an ideal test but to test that

I think you forgot to end a sentence :) Also I don't see any comment about number 6?

test/tools/llvm-objcopy/many-sections.test
42

SHT_SYMTAB_SHNDX?

47

I think checking the size here would be wise. It should be easy to do, since you know how many symbols there are. A comment describing the calculation would be helpful too, i.e. something like "# Size == 4 * symbol count".

tools/llvm-objcopy/Object.cpp
875

Ping

1130

Ping

  1. Fixed wrapping
  2. Added size of .symtab_shndx to test
  3. Fixed section type so it has the full type name

It's not an ideal test but to test that

I think you forgot to end a sentence :)

I don't remember exactly what I wanted to say but basically I'm not sure adding more tests is worth it because there are many challenges in adding this sort test.

Also I don't see any comment about number 6?

It requires uploading another binary which isn't ideal. It's also hard to produce.

This revision is now accepted and ready to land.Mar 7 2018, 1:28 AM
This revision was automatically updated to reflect the committed changes.