This is an archive of the discontinued LLVM Phabricator instance.

loading a binary at a slide multiple times leaves old entries in the SectionLoadList
ClosedPublic

Authored by jasonmolenda on Jul 25 2022, 4:56 PM.

Details

Summary

If you add a binary to lldb and do target modules load -s slide -f name multiple times, the old entries are not cleared from the SectionLoadList's addr->section map, leading to confusing lldb behavior.

SectionLoadList has a section-to-address map (m_sect_to_addr) and an address to section map (m_addr_to_sect) that are registered with the Target. SectionLoadList::SetSectionLoadAddress updates the entry in the m_sect_to_addr map if it exists, or adds it. And it adds a new entry to m_addr_to_sect as long as there isn't a section at that address already. But it never removes the old entry from m_addr_to_sect.

This results in each section having multiple entries, with each the load addresses that had been set previously, so load address -> file address translation behaves very oddly.

Diff Detail

Event Timeline

jasonmolenda created this revision.Jul 25 2022, 4:56 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 25 2022, 4:56 PM
jasonmolenda requested review of this revision.Jul 25 2022, 4:56 PM

I should have added, this bug has been in lldb probably forever - at least for a few years.

Also, this test case explicitly builds without debug information because debug information (on Darwin) results in a different bug being hit, where these data symbols are not updated when target modules load -s slide is done, somehow. The symbol table from the debug info must not be updated, I don't know, I filed a little bug on myself to look into that. This has also behaved this way for years as well, not a recent regression.

DavidSpickett added a subscriber: DavidSpickett.
DavidSpickett added inline comments.
lldb/test/API/functionalities/multiple-slides/TestMultipleSlides.py
35

Is it worth checking here that the value of &first has actually changed, or is sliding in general tested elsewhere?

lldb/test/API/functionalities/multiple-slides/main.c
3

2048 is a random number or are you fitting into some page size or just a distance big enough to prevent some accidentally correct but wrong behaviour by lldb?

jasonmolenda added inline comments.Jul 26 2022, 1:12 PM
lldb/test/API/functionalities/multiple-slides/TestMultipleSlides.py
35

Good idea, that's not a bug we're seeing here (that's file address -> load address translation, if anything) but I mention that this test case doesn't work when there's debug info on Darwin; one way you can see the problem with debug info is that the address of these variables is not shown as changing, when the slide is applied.

lldb/test/API/functionalities/multiple-slides/main.c
3

The original bug report was disassembling functions, and it was easy to see that the load address -> file address calculation was incorrect. For the test case I laid out the DATA and slides so I could repo this, but didn't show my work for how I got there. If we have a file address for DATA of 0x104000, first is at fileaddr 0x104000 second is at fileaddr 0x00104800.

We add a slide of 1990 (0x7c6), so now DATA is at load address 0x1047c6. When I print first, that turns into a load address of 0x1047c6, translates to a DATA + 0 Address object, and when I print second, that turns into a load address of 0x00104fc6 which translates to DATA + 0x2048.

Then we slide to offset 4, so now DATA is at load address 0x104004. But we still have the old 0x1047c6 load address in the table. first is load address 0x104004 matches the entry for 0x104004, so it goes to DATA + 4. second is load address 0x1047cc. This matches the old 0x1047c6 entry in the addr->sect table, so it translates to DATA + 6, and now we're indexing into the middle of the first array.

I'll be honest, I didn't do the math on this when I made the test case, I just guessed at some numbers that would probably overlap in the right way, so it wasn't completely clear to me either, haha.

Update test to address David's feedback, make it clearer how I picked the slide values and also check that the load address of the symbols are updated properly as the binary is slid.

One small comment on the updated test: surprisingly SBSymbol doesn't have a GetSymbolSize(), probably because it's a computed value and it's possible that we could not compute it, or the heuristic may be incorrect in a stripped binary, so I get the start & end address and calculate it manually. Symbol byte sizes are calculated by lldb by looking at the next symbol's file address in the symbol table, or the end of the section.

DavidSpickett added inline comments.Jul 27 2022, 1:54 AM
lldb/test/API/functionalities/multiple-slides/TestMultipleSlides.py
58

So here I expect to see -64 somewhere in the address check, is that this other bug you're talking about?

So we get the change from no known load address, to some load address, but then a further slide doesn't change it?

Getting back to this old patch. When I lost track of it, @DavidSpickett was saying that the test case was still not clear -- I was loading a section at an address, checking that I could read values out of arrays, then setting the section to a different address and seeing that I could still read the value out of the arrays. I added comments to these two key parts of the test file,

# View the first element of `first` and `second` with
# no slide applied, but with load address set.
#
# In memory, we have something like
#    0x1000 - 0x17ff  first[]
#    0x1800 - 0x1fff  second[]
target.SetModuleLoadAddress(module, 0)
self.expect("p/d ((int*)&first)[0]", substrs=['= 5'])
self.expect("p/d ((int*)&second)[0]", substrs=['= 6'])
self.assertEqual(first_sym.GetStartAddress().GetLoadAddress(target), 
                 first_sym.GetStartAddress().GetFileAddress())
self.assertEqual(second_sym.GetStartAddress().GetLoadAddress(target),
                 second_sym.GetStartAddress().GetFileAddress())

# Slide it a little bit less than the size of the first array.
#
# In memory, we have something like
#    0xfc0 - 0x17bf  first[]
#    0x17c0 - 0x1fbf second[]
#
# but if the original entries are still present in lldb, 
# the beginning address of second[] will get a load address
# of 0x1800, instead of 0x17c0 (0x1800-64) as we need to get.
target.SetModuleLoadAddress(module, first_size - 64)
self.expect("p/d ((int*)&first)[0]", substrs=['= 5'])
self.expect("p/d ((int*)&second)[0]", substrs=['= 6'])
self.assertNotEqual(first_sym.GetStartAddress().GetLoadAddress(target), 
                 first_sym.GetStartAddress().GetFileAddress())
self.assertNotEqual(second_sym.GetStartAddress().GetLoadAddress(target),
                 second_sym.GetStartAddress().GetFileAddress())

The inferior is never actually run, so I can change the load addresses of the sections around and it ultimately should read the data out of the file on disk every time.

And to be clear, the problem is that we start out with an expression like first[0] which we find the Section + address for first, then resolve to a load address (addr_t) via the Target SectionLoadList, and then try to read memory from that address - at which point we go the other direction, converting the addr_t into a Section+offset via the Target SectionLoadList. When we have the old section entries in SectionLoadList, we don't end up with the same Section+offset as we started on -- one of the old entries is picked up (if you slide it the right way - like the test does) and you read data from the wrong address in the file.

DavidSpickett accepted this revision.Sep 22 2022, 1:04 AM

Test is certainly clearer. LGTM.

This revision is now accepted and ready to land.Sep 22 2022, 1:04 AM

FWIW the intent of the change was always clear but writing a test for it is always going to be fiddly. My usual mindset is "what if this fails out of the blue", so I'm always asking for more comments :) .

This revision was landed with ongoing or failed builds.Sep 27 2022, 4:20 PM
This revision was automatically updated to reflect the committed changes.