This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objcopy] Fill .symtab_shndx section correctly
ClosedPublic

Authored by evgeny777 on Apr 11 2019, 1:33 AM.

Diff Detail

Event Timeline

evgeny777 created this revision.Apr 11 2019, 1:33 AM

Test case?

tools/llvm-objcopy/ELF/Object.cpp
1669

that why -> so

1670

fill index table -> fill the index table

tools/llvm-objcopy/ELF/Object.h
484

Is this assert correct? What happens if you have a very large object with many sections, but no symbols. Will that result in an empty SectionIndexSection?

Assuming it's correct, I would find it more readable for this to be assert(Size > 0);

Test case?

Have you seen bug description?
The problem with test case is that it requires file with large number of sections of which at least some should have relocation sections.
I have 19Mb object file generated by clang, but I'm not sure if using it as test case is a good idea ...

evgeny777 marked an inline comment as done.Apr 11 2019, 2:38 AM
evgeny777 added inline comments.
tools/llvm-objcopy/ELF/Object.h
484

Is this assert correct? What happens if you have a very large object with many sections, but no symbols

This function will never be called then, because Index is actually index of a section where symbol is defined.

Assuming it's correct, I would find it more readable for this to be assert(Size > 0);

Makes sense

Test case?

Have you seen bug description?
The problem with test case is that it requires file with large number of sections of which at least some should have relocation sections.
I have 19Mb object file generated by clang, but I'm not sure if using it as test case is a good idea ...

Sorry, I skipped over that somehow (I think I jumped straight from the bug). We already have at least one test that uses many sections (see many-sections.test). That unzips a pre-built object to achieve this. One option might be to update that object file (and possibly the test), if it can satisfy the prerequisites you need.

That was helpful, thanks! I'll update diff shortly

Test case?

Have you seen bug description?
The problem with test case is that it requires file with large number of sections of which at least some should have relocation sections.
I have 19Mb object file generated by clang, but I'm not sure if using it as test case is a good idea ...

Sorry, I skipped over that somehow (I think I jumped straight from the bug). We already have at least one test that uses many sections (see many-sections.test). That unzips a pre-built object to achieve this. One option might be to update that object file (and possibly the test), if it can satisfy the prerequisites you need.

Could we use llvm-mc and some loop macros to generate a binary with the required amount of sections?
I did that for a testcase that needed 32768 symbols: https://github.com/llvm/llvm-project/blob/master/lld/test/ELF/mips-out-of-bounds-call16-reloc.s

evgeny777 updated this revision to Diff 194655.Apr 11 2019, 3:18 AM

Added test case, addressed comments.

jhenderson accepted this revision.Apr 12 2019, 2:33 AM

LGTM.

Could we use llvm-mc and some loop macros to generate a binary with the required amount of sections?
I did that for a testcase that needed 32768 symbols: https://github.com/llvm/llvm-project/blob/master/lld/test/ELF/mips-out-of-bounds-call16-reloc.s

Good tip, thanks. It's probably worth bearing in mind in case we have any more tests along these lines that the existing pre-canned binary can't be used for.

test/tools/llvm-objcopy/ELF/many-sections.test
56 ↗(On Diff #194655)

Looks to me like you've hit a bug here. This should actually be index 0xff01 (going via SHN_XINDEX), not PRC[0xff01] which is a reserved value. readelf will print 65281 here. Could you file a bug against llvm-readobj, please?

You might want to consider using llvm-readobj instead of llvm-readelf, as this doesn't have that bug.

This revision is now accepted and ready to land.Apr 12 2019, 2:33 AM

Looks to me like you've hit a bug here

Are you sure it's a bug? According to sources it looks like some extension to original readelf output:

// Find if:
// Processor specific
if (SectionIndex >= ELF::SHN_LOPROC && SectionIndex <= ELF::SHN_HIPROC)
  return std::string("PRC[0x") +
         to_string(format_hex_no_prefix(SectionIndex, 4)) + "]";
// OS specific
if (SectionIndex >= ELF::SHN_LOOS && SectionIndex <= ELF::SHN_HIOS)
  return std::string("OS[0x") +
         to_string(format_hex_no_prefix(SectionIndex, 4)) + "]";
// Architecture reserved:
if (SectionIndex >= ELF::SHN_LORESERVE &&
    SectionIndex <= ELF::SHN_HIRESERVE)
  return std::string("RSV[0x") +
         to_string(format_hex_no_prefix(SectionIndex, 4)) + "]";

Anyway I'll switch to llvm-readobj in the test

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptApr 12 2019, 4:59 AM

Looks to me like you've hit a bug here

Are you sure it's a bug? According to sources it looks like some extension to original readelf output:

// Find if:
// Processor specific
if (SectionIndex >= ELF::SHN_LOPROC && SectionIndex <= ELF::SHN_HIPROC)
  return std::string("PRC[0x") +
         to_string(format_hex_no_prefix(SectionIndex, 4)) + "]";
// OS specific
if (SectionIndex >= ELF::SHN_LOOS && SectionIndex <= ELF::SHN_HIOS)
  return std::string("OS[0x") +
         to_string(format_hex_no_prefix(SectionIndex, 4)) + "]";
// Architecture reserved:
if (SectionIndex >= ELF::SHN_LORESERVE &&
    SectionIndex <= ELF::SHN_HIRESERVE)
  return std::string("RSV[0x") +
         to_string(format_hex_no_prefix(SectionIndex, 4)) + "]";

Anyway I'll switch to llvm-readobj in the test

Yes, it's a bug. Where the original st_shndx field is between SHN_LORESERVE and SHN_HIRESERVE (and not SHN_XINDEX), it should be decorated in this way. In all other cases it shouldn't be. It's a little complicated to explain in detail, but I'll do my best. If we look at the section header table, there are sections with index 0xff00, 0xff01, etc. Thus when we generally refer to sections with index 0xff00 etc we are talking about those sections. The symbol st_shndx field (i.e. the field indicating the symbol's section) however is only 16 bits, and the upper few values are reserved for various purposes. The SHN_XINDEX value (0xffff) in this context means look up the section index in the SHT_SYMTAB_SHNDX section instead. Other values in this range have other meanings (e.g. absolute and common symbols). If the real section index for the symbol's section is >= SHN_LORESERVE, it is impossible to represent it in directly in st_shndx, so it will be SHN_XINDEX.

llvm-readelf should print the real section index in the section index column, or a special interpretation where the value in the st_shndx field is a reserved value, such as SHN_ABS. The code in llvm-readelf should only do that special interpretation for the raw st_shndx field value, not the interpreted value (i.e. the value after lookups in SHT_SYMTAB_SHNDX) which I think is what it is doing (though I haven't looked in depth at the code yet to be 100% sure).

bd1976llvm added inline comments.
llvm/trunk/tools/llvm-objcopy/ELF/Object.cpp
492 ↗(On Diff #194843)

Let's stick to SectionIndexTable =! nullptr given line 501