This is an archive of the discontinued LLVM Phabricator instance.

Fix lookup of symbols at the same address with no size vs. size
Changes PlannedPublic

Authored by jankratochvil on Jan 13 2020, 3:17 AM.

Details

Reviewers
labath
omjavaid
Summary

The Fedora problem has been fixed by D63540.
But as reported by @omjavaid it regressed arm32: Ubuntu Xenial, Bionic and Debian Buster
I found it also reproducible with on Fedora in chroot with ubuntu-18.04-server-cloudimg-armhf.
The regression is due to:

GetAddressClass fails to recognized 0x102f0 as a code address:

PASS:
(lldb) p (void)sync()
GetAddressClass:0x102f1
GetAddressClass:0x102f1=(null) ValueIsAddress=1 section_type=1
GetAddressClass:0x96040
GetAddressClass:0x96040=__mmap ValueIsAddress=1 section_type=1
GetAddressClass:0x102f1
GetAddressClass:0x102f1=(null) ValueIsAddress=1 section_type=1
GetAddressClass:0x102f0
GetAddressClass:0x102f0=(null) ValueIsAddress=1 section_type=1
...

FAIL:
(lldb) p (void)sync()
GetAddressClass:0x102f1
GetAddressClass:0x102f1=_start ValueIsAddress=1 section_type=1
GetAddressClass:0x96040
GetAddressClass:0x96040=__mmap ValueIsAddress=1 section_type=1
GetAddressClass:0x102f1
GetAddressClass:0x102f1=_start ValueIsAddress=1 section_type=1
GetAddressClass:0x102f0
...

That is due to:

symtab.fail:[   11]     12     Invalid         0x00000000000102f0                    0x0000000000000000 0x00000003
symtab.fail:[   66]     99   X Code            0x00000000000102f0                    0x0000000000000030 0x00000012 _start
symtab.pass:[   11]     12     Invalid         0x00000000000102f0                    0x0000000000000030 0x00000003
symtab.pass:[   66]     99   X Code            0x00000000000102f0                    0x0000000000000030 0x00000012 _start

The difference is in the 'Invalid' symbol which is:

Num:    Value  Size Type    Bind   Vis      Ndx Name
 12: 000102f0     0 SECTION LOCAL  DEFAULT   12

Apparently ARM32 relies on that section symbol to have proper size. I do not see how Symtab::InitAddressIndexes could handle STT_SECTION in a special way as that is ELF-type specific Symbol characteristics:

uint32_t m_flags; // A copy of the flags from the original symbol table, the
                  // ObjectFile plug-in can interpret these

Diff Detail

Event Timeline

jankratochvil created this revision.Jan 13 2020, 3:17 AM
clayborg added inline comments.
lldb/source/Symbol/Symtab.cpp
898–900

So if we have N symbols for address 0x1000 with size of zero and we have one with a valid size, do we leave all symbols with zero size stay with a size of zero? I was confused by the comment and wanted to clarify.

jankratochvil marked an inline comment as done.Jan 13 2020, 12:13 PM
jankratochvil added inline comments.
lldb/source/Symbol/Symtab.cpp
898–900

Yes. It sets size for at most one of the zero-sized symbols and it does that only in the case there is no sizeful symbol between them.
Maybe it should set all the zero-sized symbols in such case.
But that still would not help the arm32 regression as in that case there is one section-type zero-sized symbol and one sizeful regular symbol.

Apparently ARM32 relies on that section symbol to have proper size. I do not see how Symtab::InitAddressIndexes could handle STT_SECTION in a special way as that is ELF-type specific Symbol characteristics:

Right, so I think that the thing we need to fix is the ObjectFileELF::GetAddressClass implementation. There's no requirement that this function needs to be implemented on top of the Symtab class. In fact, this isn't really true even now, as the function falls back on the hand-maintained map in m_address_class_map if the base implementation fails. Maybe we ought to move this logic up front, so that it takes precedence over the base implementation? I'm not sure about that, as I am not that familiar with how the arm/thumb determination is supposed to work, but @omjavaid might be able to help here: How are the arm/thumb mapping symbols supposed to work exactly? What's the "official" algorithm for checking whether a piece of the .text section is arm or thumb?